Thread: Schema variables - new implementation for Postgres 15

Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi,

I am returning back to implementation of schema variables. The schema variables can be used as an alternative to package variables (Oracle's PL/SQL or ADA). The schema variables can be used as fast and safe storage of session information for RLS too.

The previous implementation had not cleanly implemented execution of the LET statement. It was something between query and utility, and although it was working - it was out of Postgres concept (with different implementation of queries and utilities).

I totally rewrote the implementation of the LET statement. I prepared two variants:

First variant is based on the introduction of the new command type CMD_LET and new very small executor node SetVariable (this is a very very reduced analogy of ModifyTable node). The code is consistent and what is important - the LET statement can be prepared. The execution is relatively fast from PLpgSQL too. Without any special support the execution has the same speed like non simple queries. The statement reuses  an execution plan, but simple execution is not supported.

Second variant is implemented like a classic utility command. There is not any surprise. It is shorter, simple, but the LET statement cannot be prepared (this is the limit of all utility statements). Without special support in PLpgSQL the execution is about 10x slower than the execution of the first variant. But there is a new possibility of using the main parser from PLpgSQL (implemented by Tom for new implementation of assign statement in pg 14), and then this support in plpgsql requires only a few lines). When the statement LET is explicitly supported by PLpgSQL, then execution is very fast (the speed is comparable with the speed of the assign statement) - it is about 10x faster than the first variant.

I tested code

do $$
declare x int ;
begin
  for i in 1..1000000
  loop
    let ooo = i;
  end loop;
end;
$$;

variant 1 .. 1500 ms
variant 2 with PLpgSQL support .. 140 ms
variant 2 without PLpgSQL support 9000 ms

The slower speed of the first variant from PLpgSQL can be fixed. But for this moment, the speed is good enough. This is the worst case, because in the first variant LET statement cannot use optimization for simple query evaluation (now).

Now I think so implementation is significantly cleaner, and I hope so it will be more acceptable for committers.

I am starting a new thread, because this is a new implementation, and because I am sending two alternative implementations of one functionality.

Comments, notes, objections?

Regards

Pavel


Attachment

RE: Schema variables - new implementation for Postgres 15

From
"tsunakawa.takay@fujitsu.com"
Date:

From: Pavel Stehule <pavel.stehule@gmail.com>

--------------------------------------------------

do $$

declare x int ;

begin

  for i in 1..1000000

  loop

    let ooo = i;

  end loop;

end;

$$;

 

variant 1 .. 1500 ms

variant 2 with PLpgSQL support .. 140 ms

variant 2 without PLpgSQL support 9000 ms

--------------------------------------------------

 

 

That's impressive!  But 1 million times of variable assignment took only 140 ms?  It's that one assignment took only 140 nanosecond, which is near one DRAM access?  Can PL/pgSQL processing be really so fast?

 

 

Regards

Takayuki Tsunakawa

 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 15. 4. 2021 v 18:02 odesílatel tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> napsal:

From: Pavel Stehule <pavel.stehule@gmail.com>

--------------------------------------------------

do $$

declare x int ;

begin

  for i in 1..1000000

  loop

    let ooo = i;

  end loop;

end;

$$;

 

variant 1 .. 1500 ms

variant 2 with PLpgSQL support .. 140 ms

variant 2 without PLpgSQL support 9000 ms

--------------------------------------------------

 

 

That's impressive!  But 1 million times of variable assignment took only 140 ms?  It's that one assignment took only 140 nanosecond, which is near one DRAM access?  Can PL/pgSQL processing be really so fast?


In this case the PLpgSQL can be very fast - and after changes in pg 13, the PLpgSQL is not significantly slower than Lua or than PHP.

Every body can repeat these tests - I did it on my Lenovo T520 notebook

Pavel


 

 

Regards

Takayuki Tsunakawa

 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 15. 4. 2021 v 10:42 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi,

I am returning back to implementation of schema variables. The schema variables can be used as an alternative to package variables (Oracle's PL/SQL or ADA). The schema variables can be used as fast and safe storage of session information for RLS too.

The previous implementation had not cleanly implemented execution of the LET statement. It was something between query and utility, and although it was working - it was out of Postgres concept (with different implementation of queries and utilities).

I totally rewrote the implementation of the LET statement. I prepared two variants:

First variant is based on the introduction of the new command type CMD_LET and new very small executor node SetVariable (this is a very very reduced analogy of ModifyTable node). The code is consistent and what is important - the LET statement can be prepared. The execution is relatively fast from PLpgSQL too. Without any special support the execution has the same speed like non simple queries. The statement reuses  an execution plan, but simple execution is not supported.

Second variant is implemented like a classic utility command. There is not any surprise. It is shorter, simple, but the LET statement cannot be prepared (this is the limit of all utility statements). Without special support in PLpgSQL the execution is about 10x slower than the execution of the first variant. But there is a new possibility of using the main parser from PLpgSQL (implemented by Tom for new implementation of assign statement in pg 14), and then this support in plpgsql requires only a few lines). When the statement LET is explicitly supported by PLpgSQL, then execution is very fast (the speed is comparable with the speed of the assign statement) - it is about 10x faster than the first variant.

I tested code

do $$
declare x int ;
begin
  for i in 1..1000000
  loop
    let ooo = i;
  end loop;
end;
$$;

variant 1 .. 1500 ms
variant 2 with PLpgSQL support .. 140 ms
variant 2 without PLpgSQL support 9000 ms

The slower speed of the first variant from PLpgSQL can be fixed. But for this moment, the speed is good enough. This is the worst case, because in the first variant LET statement cannot use optimization for simple query evaluation (now).

Now I think so implementation is significantly cleaner, and I hope so it will be more acceptable for committers.

I am starting a new thread, because this is a new implementation, and because I am sending two alternative implementations of one functionality.

Comments, notes, objections?


I am sending only one patch and I assign this thread to commitfest application

Regards

Pavel
 
Regards

Pavel


Attachment

Re: Schema variables - new implementation for Postgres 15

From
"Joel Jacobson"
Date:
On Thu, Apr 15, 2021, at 10:42, Pavel Stehule wrote:
Attachments:
  • schema-variables-v-execnode-2021-01.patch
  • schema-variables-v-utility-2021-01.patch

Applications are currently know to be misusing set_config()+current_setting() to pass information in a session or transaction.

Such users might be interested in using Schema variables as a better replacement.

However, since set_config() is transactional, it can't be used as a drop-in replacement:

+   <para>
+    The value of a schema variable is local to the current session. Retrieving
+    a variable's value returns either a NULL or a default value, unless its value
+    is set to something else in the current session with a LET command. The content
+    of a variable is not transactional. This is the same as in regular variables
+    in PL languages.
+   </para>

I think the "The content of a variable is not transactional." part is therefore a bad idea.

Another pattern is to use TEMP TABLEs to pass around information in a session or transaction.
If the LET command would be transactional, it could be used as a drop-in replacement for such use-cases as well.

Furthermore, I think a non-transactional LET command would be insidious,
since it looks like any other SQL command, all of which are transactional.
(The ones that aren't such as REINDEX CONCURRENTLY will properly throw an error if run inside a transaction block.)

A non-transactional LET command IMO would be non-SQL-idiomatic and non-intuitive.

/Joel



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 16. 4. 2021 v 8:07 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Thu, Apr 15, 2021, at 10:42, Pavel Stehule wrote:
Attachments:
  • schema-variables-v-execnode-2021-01.patch
  • schema-variables-v-utility-2021-01.patch

Applications are currently know to be misusing set_config()+current_setting() to pass information in a session or transaction.

Such users might be interested in using Schema variables as a better replacement.

However, since set_config() is transactional, it can't be used as a drop-in replacement:

+   <para>
+    The value of a schema variable is local to the current session. Retrieving
+    a variable's value returns either a NULL or a default value, unless its value
+    is set to something else in the current session with a LET command. The content
+    of a variable is not transactional. This is the same as in regular variables
+    in PL languages.
+   </para>

I think the "The content of a variable is not transactional." part is therefore a bad idea.

Another pattern is to use TEMP TABLEs to pass around information in a session or transaction.
If the LET command would be transactional, it could be used as a drop-in replacement for such use-cases as well.

Furthermore, I think a non-transactional LET command would be insidious,
since it looks like any other SQL command, all of which are transactional.
(The ones that aren't such as REINDEX CONCURRENTLY will properly throw an error if run inside a transaction block.)

A non-transactional LET command IMO would be non-SQL-idiomatic and non-intuitive.

I am sorry, but in this topic we have different opinions. The variables in PLpgSQL are not transactional too (same is true in Perl, Python, ...). Session variables in Oracle, MS SQL, DB2, MySQL are not transactional too. My primary focus is PLpgSQL - and I would like to use schema variables as global plpgsql variables (from PLpgSQL perspective) - that means in Postgres's perspective session variables. But in Postgres, I have to write features that will work with others PL too - PLPython, PLPerl, ... Statement SET in ANSI/SQL standard (SQL/PSM) doesn't expect transactional behaviour for variables too. Unfortunately SET keyword is used in Postgres for GUC, and isn't possible to reuse without a compatibility break.

The PostgreSQL configuration is transactional, but it is a different feature designed for different purposes. Using GUC like session variables is just a workaround. It can be useful for some cases, sure. But it is not usual behaviour. And for other cases the transactional behaviour is not practical. Schema variables are not replacement of GUC, schema variables are not replacement of temporal tables. There is a prepared patch for global temp tables. I hope so this patch can be committed to Postgres 15. Global temp tables fixes almost all disadvantages of temporary tables in Postgres. So the schema variable is not a one row table. It is a different creature - designed to support the server's side procedural features.

I have prepared a patch that allows non default transactional behaviour (but this behaviour should not be default - I didn't design schema variables as temp tables replacement). This patch increases the length of the current patch about 1/4, and I have enough work with rebasing with the current patch, so I didn't send it to commitfest now. If schema variables will be inside core, this day I'll send the patch that allows transactional behaviour for schema variables - I promise.

Regards

Pavel




/Joel



RE: Schema variables - new implementation for Postgres 15

From
"tsunakawa.takay@fujitsu.com"
Date:

From: Pavel Stehule <pavel.stehule@gmail.com>

--------------------------------------------------

I am sorry, but in this topic we have different opinions. The variables in PLpgSQL are not transactional too (same is true in Perl, Python, ...). Session variables in Oracle, MS SQL, DB2, MySQL are not transactional too. My primary focus is PLpgSQL - and I would like to use schema variables as global plpgsql variables (from PLpgSQL perspective) - that means in Postgres's perspective session variables. But in Postgres, I have to write features that will work with others PL too - PLPython, PLPerl, ... Statement SET in ANSI/SQL standard (SQL/PSM) doesn't expect transactional behaviour for variables too. Unfortunately SET keyword is used in Postgres for GUC, and isn't possible to reuse without a compatibility break.

 

The PostgreSQL configuration is transactional, but it is a different feature designed for different purposes. Using GUC like session variables is just a workaround. It can be useful for some cases, sure. But it is not usual behaviour. And for other cases the transactional behaviour is not practical. Schema variables are not replacement of GUC, schema variables are not replacement of temporal tables. There is a prepared patch for global temp tables. I hope so this patch can be committed to Postgres 15. Global temp tables fixes almost all disadvantages of temporary tables in Postgres. So the schema variable is not a one row table. It is a different creature - designed to support the server's side procedural features.

--------------------------------------------------

 

+1

I understand (and wish) this feature is intended to ease migration from Oracle PL/SQL, which will further increase the popularity of Postgres.  So, the transactional behavior is not necessary unless Oracle has such a feature.

 

Furthermore, Postgres already has some non-transactonal SQL commands.  So, I don't think we need to reject non-transactional LET.

 

* Sequence operation: SELECT nextval/setval

* SET [SESSION]

* SET ROLE

* SET SESSION AUTHORIZATION

 

 

--------------------------------------------------

I have prepared a patch that allows non default transactional behaviour (but this behaviour should not be default - I didn't design schema variables as temp tables replacement). This patch increases the length of the current patch about 1/4, and I have enough work with rebasing with the current patch, so I didn't send it to commitfest now. If schema variables will be inside core, this day I'll send the patch that allows transactional behaviour for schema variables - I promise.

--------------------------------------------------

 

I prefer the simpler, targeted one without transactional behavior initially, because added complexity might prevent this feature from being committed in PG 15.

 

 

Regards

Takayuki Tsunakawa

 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

only rebase

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 12. 5. 2021 v 6:17 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

only rebase

second try - rebase after serial_scheduler remove

Regards

Pavel


Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 12. 5. 2021 v 7:37 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


st 12. 5. 2021 v 6:17 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

only rebase

second try - rebase after serial_scheduler remove

only rebase

Regards

Pavel


Regards

Pavel


Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

rebase

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 12. 6. 2021 v 8:00 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

rebase


rebase only

Regards

Pavel


Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Gilles Darold
Date:
Hi,

Review resume:


This patch implements Schema Variables that are database objects that can hold a single or composite value following the data type used at variable declaration. Schema variables, like relations, exist within a schema and their access is controlled via GRANT and REVOKE commands. The schema variable can be created by the CREATE VARIABLE command, altered using ALTER VARIABLE and removed using DROP VARIABLE.

The value of a schema variable is local to the current session. Retrieving a variable's value returns either a NULL or a default value, unless its value is set to something else in the current session with a LET command. The content of a variable is not transactional. This is the same as in regular variables in PL languages.

Schema variables are retrieved by the SELECT SQL command. Their value is set with the LET SQL command. While schema variables share properties with tables, their value cannot be updated with an UPDATE command.


The patch apply with the patch command without problem and compilation reports no warning or errors. Regression tests pass successfully using make check or make installcheck
It also includes all documentation and regression tests.

Performances are near the set of plpgsql variable settings which is impressive:

do $$
declare var1 int ; i int;
begin
  for i in 1..1000000
  loop
    var1 := i;
  end loop;
end;
$$;
DO
Time: 71,515 ms

CREATE VARIABLE var1 AS integer;
do $$
declare i int ;
begin
  for i in 1..1000000
  loop
    let var1 = i;
  end loop;
end;
$$;
DO
Time: 94,658 ms

There is just one thing that puzzles me. We can use :

    CREATE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"

which I understand and is the right behavior. But if we use:

    CREATE IMMUTABLE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"
    DETAIL:  The schema variable was not initialized yet.
    postgres=# LET var1=current_date;
    ERROR:  schema variable "var1" is declared IMMUTABLE

It should probably be better to not allow NOT NULL when IMMUTABLE is used because the variable can not be used at all.  Also probably IMMUTABLE without a DEFAULT value should also be restricted as it makes no sens. If the user wants the variable to be NULL he must use DEFAULT NULL. This is just a though, the above error messages are explicit and the user can understand what wrong declaration he have done.

Except that I think this patch is ready for committers, so if there is no other opinion in favor of restricting the use of IMMUTABLE with NOT NULL and DEFAULT I will change the status to ready for committers.

-- 
Gilles Darold
http://www.darold.net/

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 28. 8. 2021 v 11:57 odesílatel Gilles Darold <gilles@darold.net> napsal:
Hi,

Review resume:


This patch implements Schema Variables that are database objects that can hold a single or composite value following the data type used at variable declaration. Schema variables, like relations, exist within a schema and their access is controlled via GRANT and REVOKE commands. The schema variable can be created by the CREATE VARIABLE command, altered using ALTER VARIABLE and removed using DROP VARIABLE.

The value of a schema variable is local to the current session. Retrieving a variable's value returns either a NULL or a default value, unless its value is set to something else in the current session with a LET command. The content of a variable is not transactional. This is the same as in regular variables in PL languages.

Schema variables are retrieved by the SELECT SQL command. Their value is set with the LET SQL command. While schema variables share properties with tables, their value cannot be updated with an UPDATE command.


The patch apply with the patch command without problem and compilation reports no warning or errors. Regression tests pass successfully using make check or make installcheck
It also includes all documentation and regression tests.

Performances are near the set of plpgsql variable settings which is impressive:

do $$
declare var1 int ; i int;
begin
  for i in 1..1000000
  loop
    var1 := i;
  end loop;
end;
$$;
DO
Time: 71,515 ms

CREATE VARIABLE var1 AS integer;
do $$
declare i int ;
begin
  for i in 1..1000000
  loop
    let var1 = i;
  end loop;
end;
$$;
DO
Time: 94,658 ms

There is just one thing that puzzles me. We can use :

    CREATE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"

which I understand and is the right behavior. But if we use:

    CREATE IMMUTABLE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"
    DETAIL:  The schema variable was not initialized yet.
    postgres=# LET var1=current_date;
    ERROR:  schema variable "var1" is declared IMMUTABLE

It should probably be better to not allow NOT NULL when IMMUTABLE is used because the variable can not be used at all.  Also probably IMMUTABLE without a DEFAULT value should also be restricted as it makes no sens. If the user wants the variable to be NULL he must use DEFAULT NULL. This is just a though, the above error messages are explicit and the user can understand what wrong declaration he have done.

I thought about this case, and I have one scenario, where this behaviour can be useful. When the variable is declared as IMMUTABLE NOT NULL without not null default, then any access to the content of the variable has to fail. I think it can be used for detection, where and when the variable is first used. So this behavior is allowed just because I think, so this feature can be interesting for debugging. If this idea is too strange, I have no problem to disable this case.

Regards

Pavel
 

Except that I think this patch is ready for committers, so if there is no other opinion in favor of restricting the use of IMMUTABLE with NOT NULL and DEFAULT I will change the status to ready for committers.

-- 
Gilles Darold
http://www.darold.net/

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

so 28. 8. 2021 v 11:57 odesílatel Gilles Darold <gilles@darold.net> napsal:
Hi,

Review resume:


This patch implements Schema Variables that are database objects that can hold a single or composite value following the data type used at variable declaration. Schema variables, like relations, exist within a schema and their access is controlled via GRANT and REVOKE commands. The schema variable can be created by the CREATE VARIABLE command, altered using ALTER VARIABLE and removed using DROP VARIABLE.

The value of a schema variable is local to the current session. Retrieving a variable's value returns either a NULL or a default value, unless its value is set to something else in the current session with a LET command. The content of a variable is not transactional. This is the same as in regular variables in PL languages.

Schema variables are retrieved by the SELECT SQL command. Their value is set with the LET SQL command. While schema variables share properties with tables, their value cannot be updated with an UPDATE command.


The patch apply with the patch command without problem and compilation reports no warning or errors. Regression tests pass successfully using make check or make installcheck
It also includes all documentation and regression tests.

Performances are near the set of plpgsql variable settings which is impressive:

do $$
declare var1 int ; i int;
begin
  for i in 1..1000000
  loop
    var1 := i;
  end loop;
end;
$$;
DO
Time: 71,515 ms

CREATE VARIABLE var1 AS integer;
do $$
declare i int ;
begin
  for i in 1..1000000
  loop
    let var1 = i;
  end loop;
end;
$$;
DO
Time: 94,658 ms

There is just one thing that puzzles me. We can use :

    CREATE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"

which I understand and is the right behavior. But if we use:

    CREATE IMMUTABLE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"
    DETAIL:  The schema variable was not initialized yet.
    postgres=# LET var1=current_date;
    ERROR:  schema variable "var1" is declared IMMUTABLE

It should probably be better to not allow NOT NULL when IMMUTABLE is used because the variable can not be used at all.  Also probably IMMUTABLE without a DEFAULT value should also be restricted as it makes no sens. If the user wants the variable to be NULL he must use DEFAULT NULL. This is just a though, the above error messages are explicit and the user can understand what wrong declaration he have done.

I wrote a check that disables this case.  Please, see the attached patch. I agree, so this case is confusing, and it is better to disable it.

Regards

Pavel


Except that I think this patch is ready for committers, so if there is no other opinion in favor of restricting the use of IMMUTABLE with NOT NULL and DEFAULT I will change the status to ready for committers.

-- 
Gilles Darold
http://www.darold.net/
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Gilles Darold
Date:
Le 08/09/2021 à 13:41, Pavel Stehule a écrit :
Hi

so 28. 8. 2021 v 11:57 odesílatel Gilles Darold <gilles@darold.net> napsal:
Hi,

Review resume:


This patch implements Schema Variables that are database objects that can hold a single or composite value following the data type used at variable declaration. Schema variables, like relations, exist within a schema and their access is controlled via GRANT and REVOKE commands. The schema variable can be created by the CREATE VARIABLE command, altered using ALTER VARIABLE and removed using DROP VARIABLE.

The value of a schema variable is local to the current session. Retrieving a variable's value returns either a NULL or a default value, unless its value is set to something else in the current session with a LET command. The content of a variable is not transactional. This is the same as in regular variables in PL languages.

Schema variables are retrieved by the SELECT SQL command. Their value is set with the LET SQL command. While schema variables share properties with tables, their value cannot be updated with an UPDATE command.


The patch apply with the patch command without problem and compilation reports no warning or errors. Regression tests pass successfully using make check or make installcheck
It also includes all documentation and regression tests.

Performances are near the set of plpgsql variable settings which is impressive:

do $$
declare var1 int ; i int;
begin
  for i in 1..1000000
  loop
    var1 := i;
  end loop;
end;
$$;
DO
Time: 71,515 ms

CREATE VARIABLE var1 AS integer;
do $$
declare i int ;
begin
  for i in 1..1000000
  loop
    let var1 = i;
  end loop;
end;
$$;
DO
Time: 94,658 ms

There is just one thing that puzzles me. We can use :

    CREATE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"

which I understand and is the right behavior. But if we use:

    CREATE IMMUTABLE VARIABLE var1 AS date NOT NULL;
    postgres=# SELECT var1;
    ERROR:  null value is not allowed for NOT NULL schema variable "var1"
    DETAIL:  The schema variable was not initialized yet.
    postgres=# LET var1=current_date;
    ERROR:  schema variable "var1" is declared IMMUTABLE

It should probably be better to not allow NOT NULL when IMMUTABLE is used because the variable can not be used at all.  Also probably IMMUTABLE without a DEFAULT value should also be restricted as it makes no sens. If the user wants the variable to be NULL he must use DEFAULT NULL. This is just a though, the above error messages are explicit and the user can understand what wrong declaration he have done.

I wrote a check that disables this case.  Please, see the attached patch. I agree, so this case is confusing, and it is better to disable it.


Great, I also think that this is better to not confuse the user.

    postgres=# CREATE IMMUTABLE VARIABLE var1 AS date NOT NULL;
    ERROR:  IMMUTABLE NOT NULL variable requires default expression

Working as expected. I have moved the patch to "Ready for committers". Thanks for this feature.


-- 
Gilles Darold
http://www.darold.net/

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

Hi

Great, I also think that this is better to not confuse the user.

    postgres=# CREATE IMMUTABLE VARIABLE var1 AS date NOT NULL;
    ERROR:  IMMUTABLE NOT NULL variable requires default expression

Working as expected. I have moved the patch to "Ready for committers". Thanks for this feature.


Thank you very much

Pavel


-- 
Gilles Darold
http://www.darold.net/

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

fresh rebase

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
 > [schema-variables-20210909.patch]

Hi Pavel,

The patch applies and compiles fine but 'make check' for the 
assert-enabled fails on 131 out of 210 tests.

(while compiling HEAD checks run without errors for both assert-disabled 
and assert-enabled)


Erik Rijkers


test tablespace                   ... ok          303 ms
parallel group (20 tests):  oid char pg_lsn int2 varchar txid int4 
regproc uuid float4 text name money boolean bit float8 int8 enum numeric 
rangetypes
      boolean                      ... ok          112 ms
      char                         ... ok           57 ms
      name                         ... ok          106 ms
      varchar                      ... ok           74 ms
      text                         ... ok          106 ms
      int2                         ... ok           73 ms
      int4                         ... ok           92 ms
      int8                         ... ok          130 ms
      oid                          ... ok           55 ms
      float4                       ... ok          102 ms
      float8                       ... ok          126 ms
      bit                          ... ok          124 ms
      numeric                      ... ok          362 ms
      txid                         ... ok           87 ms
      uuid                         ... ok          100 ms
      enum                         ... ok          142 ms
      money                        ... ok          109 ms
      rangetypes                   ... ok          433 ms
      pg_lsn                       ... ok           64 ms
      regproc                      ... ok           91 ms
parallel group (20 tests):  lseg path circle time macaddr 
create_function_0 timetz line macaddr8 numerology point interval inet 
date strings polygon box multirangetypes timestamp timestamptz
      strings                      ... ok          166 ms
      numerology                   ... ok           89 ms
      point                        ... ok           96 ms
      lseg                         ... ok           35 ms
      line                         ... ok           70 ms
      box                          ... ok          255 ms
      path                         ... ok           50 ms
      polygon                      ... ok          237 ms
      circle                       ... ok           53 ms
      date                         ... ok          127 ms
      time                         ... ok           60 ms
      timetz                       ... ok           67 ms
      timestamp                    ... ok          379 ms
      timestamptz                  ... ok          413 ms
      interval                     ... ok           97 ms
      inet                         ... ok          118 ms
      macaddr                      ... ok           60 ms
      macaddr8                     ... ok           80 ms
      multirangetypes              ... ok          307 ms
      create_function_0            ... ok           63 ms
parallel group (12 tests):  comments unicode misc_sanity tstypes xid 
expressions horology geometry mvcc type_sanity regex opr_sanity
      geometry                     ... ok          140 ms
      horology                     ... ok          120 ms
      tstypes                      ... ok           53 ms
      regex                        ... ok          335 ms
      type_sanity                  ... ok          155 ms
      opr_sanity                   ... ok          355 ms
      misc_sanity                  ... ok           43 ms
      comments                     ... ok           20 ms
      expressions                  ... ok          100 ms
      unicode                      ... ok           25 ms
      xid                          ... ok           56 ms
      mvcc                         ... ok          146 ms
test create_function_1            ... ok           10 ms
test create_type                  ... ok           30 ms
test create_table                 ... ok          333 ms
test create_function_2            ... ok           11 ms
parallel group (5 tests):  copydml copyselect insert_conflict insert copy
      copy                         ... ok          336 ms
      copyselect                   ... ok           34 ms
      copydml                      ... ok           28 ms
      insert                       ... ok          291 ms
      insert_conflict              ... FAILED (test process exited with 
exit code 2)      239 ms
parallel group (3 tests):  create_operator create_procedure create_misc
      create_misc                  ... ok          131 ms
      create_operator              ... ok           29 ms
      create_procedure             ... ok           52 ms
parallel group (5 tests):  create_view create_index_spgist 
index_including create_index index_including_gist
      create_index                 ... FAILED (test process exited with 
exit code 2)     3801 ms
      create_index_spgist          ... ok          523 ms
      create_view                  ... FAILED (test process exited with 
exit code 2)      339 ms
      index_including              ... FAILED (test process exited with 
exit code 2)     3801 ms
      index_including_gist         ... FAILED (test process exited with 
exit code 2)     3801 ms
parallel group (16 tests):  create_aggregate create_cast typed_table 
drop_if_exists roleattributes create_am hash_func updatable_views errors 
infinite_recurse create_function_3 triggers constraints select inherit 
vacuum
      create_aggregate             ... FAILED (test process exited with 
exit code 2)      164 ms
      create_function_3            ... FAILED (test process exited with 
exit code 2)      164 ms
      create_cast                  ... FAILED (test process exited with 
exit code 2)      164 ms
      constraints                  ... FAILED (test process exited with 
exit code 2)      181 ms
      triggers                     ... FAILED (test process exited with 
exit code 2)      181 ms
      select                       ... FAILED (test process exited with 
exit code 2)      181 ms
      inherit                      ... FAILED (test process exited with 
exit code 2)      181 ms
      typed_table                  ... FAILED (test process exited with 
exit code 2)      163 ms
      vacuum                       ... FAILED (test process exited with 
exit code 2)      180 ms
      drop_if_exists               ... FAILED (test process exited with 
exit code 2)      163 ms
      updatable_views              ... FAILED (test process exited with 
exit code 2)      163 ms
      roleattributes               ... FAILED (test process exited with 
exit code 2)      163 ms
      create_am                    ... FAILED (test process exited with 
exit code 2)      163 ms
      hash_func                    ... FAILED (test process exited with 
exit code 2)      162 ms
      errors                       ... FAILED (test process exited with 
exit code 2)      162 ms
      infinite_recurse             ... FAILED (test process exited with 
exit code 2)      162 ms
test sanity_check                 ... FAILED (test process exited with 
exit code 2)       26 ms
parallel group (20 tests):  select_into subselect select_distinct arrays 
join namespace hash_index select_having portals transactions aggregates 
random update delete union btree_index select_implicit 
select_distinct_on prepared_xacts case
      select_into                  ... FAILED (test process exited with 
exit code 2)       20 ms
      select_distinct              ... FAILED (test process exited with 
exit code 2)       21 ms
      select_distinct_on           ... FAILED (test process exited with 
exit code 2)       26 ms
      select_implicit              ... FAILED (test process exited with 
exit code 2)       26 ms
      select_having                ... FAILED (test process exited with 
exit code 2)       23 ms
      subselect                    ... FAILED (test process exited with 
exit code 2)       20 ms
      union                        ... FAILED (test process exited with
exit code 2)       25 ms
      case                         ... FAILED (test process exited with 
exit code 2)       27 ms
      join                         ... FAILED (test process exited with 
exit code 2)       22 ms
      aggregates                   ... FAILED (test process exited with 
exit code 2)       24 ms
      transactions                 ... FAILED (test process exited with 
exit code 2)       24 ms
      random                       ... failed (ignored) (test process 
exited with exit code 2)       24 ms
      portals                      ... FAILED (test process exited with 
exit code 2)       23 ms
      arrays                       ... FAILED (test process exited with 
exit code 2)       20 ms
      btree_index                  ... FAILED (test process exited with 
exit code 2)       25 ms
      hash_index                   ... FAILED (test process exited with 
exit code 2)       22 ms
      update                       ... FAILED (test process exited with 
exit code 2)       23 ms
      delete                       ... FAILED (test process exited with 
exit code 2)       24 ms
      namespace                    ... FAILED (test process exited with 
exit code 2)       21 ms
      prepared_xacts               ... FAILED (test process exited with 
exit code 2)       25 ms
parallel group (20 tests):  gist brin identity generated password 
tablesample lock matview replica_identity rowsecurity security_label 
object_address drop_operator groupingsets join_hash privileges collate 
init_privs spgist gin
      brin                         ... FAILED (test process exited with 
exit code 2)       15 ms
      gin                          ... FAILED (test process exited with 
exit code 2)       22 ms
      gist                         ... FAILED (test process exited with 
exit code 2)       13 ms
      spgist                       ... FAILED (test process exited with 
exit code 2)       22 ms
      privileges                   ... FAILED (test process exited with 
exit code 2)       19 ms
      init_privs                   ... FAILED (test process exited with 
exit code 2)       21 ms
      security_label               ... FAILED (test process exited with 
exit code 2)       17 ms
      collate                      ... FAILED (test process exited with 
exit code 2)       20 ms
      matview                      ... FAILED (test process exited with 
exit code 2)       17 ms
      lock                         ... FAILED (test process exited with 
exit code 2)       15 ms
      replica_identity             ... FAILED (test process exited with 
exit code 2)       17 ms
      rowsecurity                  ... FAILED (test process exited with 
exit code 2)       17 ms
      object_address               ... FAILED (test process exited with 
exit code 2)       17 ms
      tablesample                  ... FAILED (test process exited with 
exit code 2)       15 ms
      groupingsets                 ... FAILED (test process exited with 
exit code 2)       17 ms
      drop_operator                ... FAILED (test process exited with 
exit code 2)       17 ms
      password                     ... FAILED (test process exited with 
exit code 2)       14 ms
      identity                     ... FAILED (test process exited with 
exit code 2)       13 ms
      generated                    ... FAILED (test process exited with 
exit code 2)       14 ms
      join_hash                    ... FAILED (test process exited with 
exit code 2)       18 ms
parallel group (2 tests):  brin_multi brin_bloom
      brin_bloom                   ... FAILED (test process exited with 
exit code 2)        4 ms
      brin_multi                   ... FAILED (test process exited with 
exit code 2)        4 ms
parallel group (14 tests):  async create_table_like collate.icu.utf8 
misc sysviews alter_operator tidscan tidrangescan alter_generic tsrf 
incremental_sort misc_functions tid dbsize
      create_table_like            ... FAILED (test process exited with 
exit code 2)       10 ms
      alter_generic                ... FAILED (test process exited with 
exit code 2)       15 ms
      alter_operator               ... FAILED (test process exited with 
exit code 2)       13 ms
      misc                         ... FAILED (test process exited with 
exit code 2)       11 ms
      async                        ... FAILED (test process exited with 
exit code 2)        9 ms
      dbsize                       ... FAILED (test process exited with 
exit code 2)       17 ms
      misc_functions               ... FAILED (test process exited with 
exit code 2)       14 ms
      sysviews                     ... FAILED (test process exited with 
exit code 2)       12 ms
      tsrf                         ... FAILED (test process exited with 
exit code 2)       14 ms
      tid                          ... FAILED (test process exited with 
exit code 2)       16 ms
      tidscan                      ... FAILED (test process exited with 
exit code 2)       13 ms
      tidrangescan                 ... FAILED (test process exited with 
exit code 2)       13 ms
      collate.icu.utf8             ... FAILED (test process exited with 
exit code 2)        9 ms
      incremental_sort             ... FAILED (test process exited with 
exit code 2)       13 ms
parallel group (6 tests):  amutils psql_crosstab collate.linux.utf8 psql 
stats_ext rules
      rules                        ... FAILED (test process exited with 
exit code 2)        8 ms
      psql                         ... FAILED (test process exited with 
exit code 2)        7 ms
      psql_crosstab                ... FAILED (test process exited with 
exit code 2)        7 ms
      amutils                      ... FAILED (test process exited with 
exit code 2)        7 ms
      stats_ext                    ... FAILED (test process exited with 
exit code 2)        8 ms
      collate.linux.utf8           ... FAILED (test process exited with 
exit code 2)        7 ms
test select_parallel              ... FAILED (test process exited with 
exit code 2)        3 ms
test write_parallel               ... FAILED (test process exited with 
exit code 2)        3 ms
parallel group (2 tests):  publication subscription
      publication                  ... FAILED (test process exited with 
exit code 2)        4 ms
      subscription                 ... FAILED (test process exited with 
exit code 2)        4 ms
parallel group (17 tests):  select_views foreign_key xmlmap window 
functional_deps tsearch cluster combocid bitmapops tsdicts equivclass 
guc indirect_toast advisory_lock foreign_data dependency portals_p2
      select_views                 ... FAILED (test process exited with 
exit code 2)       11 ms
      portals_p2                   ... FAILED (test process exited with 
exit code 2)       20 ms
      foreign_key                  ... FAILED (test process exited with 
exit code 2)       11 ms
      cluster                      ... FAILED (test process exited with 
exit code 2)       15 ms
      dependency                   ... FAILED (test process exited with 
exit code 2)       19 ms
      guc                          ... FAILED (test process exited with 
exit code 2)       16 ms
      bitmapops                    ... FAILED (test process exited with 
exit code 2)       16 ms
      combocid                     ... FAILED (test process exited with 
exit code 2)       15 ms
      tsearch                      ... FAILED (test process exited with 
exit code 2)       14 ms
      tsdicts                      ... FAILED (test process exited with 
exit code 2)       16 ms
      foreign_data                 ... FAILED (test process exited with 
exit code 2)       17 ms
      window                       ... FAILED (test process exited with 
exit code 2)       12 ms
      xmlmap                       ... FAILED (test process exited with 
exit code 2)       12 ms
      functional_deps              ... FAILED (test process exited with 
exit code 2)       13 ms
      advisory_lock                ... FAILED (test process exited with
exit code 2)       16 ms
      indirect_toast               ... FAILED (test process exited with 
exit code 2)       16 ms
      equivclass                   ... FAILED (test process exited with 
exit code 2)       15 ms
parallel group (6 tests):  json json_encoding jsonb jsonpath_encoding 
jsonb_jsonpath jsonpath
      json                         ... FAILED (test process exited with 
exit code 2)        4 ms
      jsonb                        ... FAILED (test process exited with 
exit code 2)        6 ms
      json_encoding                ... FAILED (test process exited with 
exit code 2)        5 ms
      jsonpath                     ... FAILED (test process exited with 
exit code 2)       10 ms
      jsonpath_encoding            ... FAILED (test process exited with 
exit code 2)        6 ms
      jsonb_jsonpath               ... FAILED (test process exited with 
exit code 2)        7 ms
parallel group (19 tests):  plpgsql limit rowtypes sequence largeobject 
returning domain polymorphism plancache prepare alter_table truncate 
temp rangefuncs with copy2 conversion schema_variables xml
      plancache                    ... FAILED (test process exited with 
exit code 2)       16 ms
      limit                        ... FAILED (test process exited with 
exit code 2)       10 ms
      plpgsql                      ... FAILED (test process exited with 
exit code 2)        7 ms
      copy2                        ... FAILED (test process exited with 
exit code 2)       25 ms
      temp                         ... FAILED (test process exited with 
exit code 2)       21 ms
      domain                       ... FAILED (test process exited with 
exit code 2)       13 ms
      rangefuncs                   ... FAILED (test process exited with 
exit code 2)       22 ms
      prepare                      ... FAILED (test process exited with 
exit code 2)       19 ms
      conversion                   ... FAILED (test process exited with 
exit code 2)       24 ms
      truncate                     ... FAILED (test process exited with 
exit code 2)       19 ms
      alter_table                  ... FAILED (test process exited with 
exit code 2)       18 ms
      sequence                     ... FAILED (test process exited with 
exit code 2)       11 ms
      polymorphism                 ... FAILED (test process exited with 
exit code 2)       13 ms
      rowtypes                     ... FAILED (test process exited with 
exit code 2)       10 ms
      returning                    ... FAILED (test process exited with 
exit code 2)       11 ms
      largeobject                  ... FAILED (test process exited with 
exit code 2)       10 ms
      with                         ... FAILED (test process exited with 
exit code 2)       22 ms
      xml                          ... FAILED (test process exited with 
exit code 2)       23 ms
      schema_variables             ... FAILED (test process exited with 
exit code 2)       23 ms
parallel group (11 tests):  explain hash_part partition_info reloptions 
memoize compression partition_aggregate partition_join indexing 
partition_prune tuplesort
      partition_join               ... FAILED      902 ms
      partition_prune              ... ok         1006 ms
      reloptions                   ... ok          106 ms
      hash_part                    ... ok           99 ms
      indexing                     ... ok          929 ms
      partition_aggregate          ... ok          791 ms
      partition_info               ... ok          104 ms
      tuplesort                    ... ok         1099 ms
      explain                      ... ok           90 ms
      compression                  ... ok          214 ms
      memoize                      ... ok          109 ms
parallel group (2 tests):  event_trigger oidjoins
      event_trigger                ... ok          107 ms
      oidjoins                     ... ok          157 ms
test fast_default                 ... ok          138 ms
test stats                        ... ok          617 ms



On 9/9/21 6:59 AM, Pavel Stehule wrote:
> Hi
> 
> fresh rebase
> 
> Regards
> 
> Pavel
> 



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

čt 9. 9. 2021 v 12:21 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
 > [schema-variables-20210909.patch]

Hi Pavel,

The patch applies and compiles fine but 'make check' for the
assert-enabled fails on 131 out of 210 tests.

This morning I tested it. I'll recheck it.

Pavel


(while compiling HEAD checks run without errors for both assert-disabled
and assert-enabled)


Erik Rijkers


test tablespace                   ... ok          303 ms
parallel group (20 tests):  oid char pg_lsn int2 varchar txid int4
regproc uuid float4 text name money boolean bit float8 int8 enum numeric
rangetypes
      boolean                      ... ok          112 ms
      char                         ... ok           57 ms
      name                         ... ok          106 ms
      varchar                      ... ok           74 ms
      text                         ... ok          106 ms
      int2                         ... ok           73 ms
      int4                         ... ok           92 ms
      int8                         ... ok          130 ms
      oid                          ... ok           55 ms
      float4                       ... ok          102 ms
      float8                       ... ok          126 ms
      bit                          ... ok          124 ms
      numeric                      ... ok          362 ms
      txid                         ... ok           87 ms
      uuid                         ... ok          100 ms
      enum                         ... ok          142 ms
      money                        ... ok          109 ms
      rangetypes                   ... ok          433 ms
      pg_lsn                       ... ok           64 ms
      regproc                      ... ok           91 ms
parallel group (20 tests):  lseg path circle time macaddr
create_function_0 timetz line macaddr8 numerology point interval inet
date strings polygon box multirangetypes timestamp timestamptz
      strings                      ... ok          166 ms
      numerology                   ... ok           89 ms
      point                        ... ok           96 ms
      lseg                         ... ok           35 ms
      line                         ... ok           70 ms
      box                          ... ok          255 ms
      path                         ... ok           50 ms
      polygon                      ... ok          237 ms
      circle                       ... ok           53 ms
      date                         ... ok          127 ms
      time                         ... ok           60 ms
      timetz                       ... ok           67 ms
      timestamp                    ... ok          379 ms
      timestamptz                  ... ok          413 ms
      interval                     ... ok           97 ms
      inet                         ... ok          118 ms
      macaddr                      ... ok           60 ms
      macaddr8                     ... ok           80 ms
      multirangetypes              ... ok          307 ms
      create_function_0            ... ok           63 ms
parallel group (12 tests):  comments unicode misc_sanity tstypes xid
expressions horology geometry mvcc type_sanity regex opr_sanity
      geometry                     ... ok          140 ms
      horology                     ... ok          120 ms
      tstypes                      ... ok           53 ms
      regex                        ... ok          335 ms
      type_sanity                  ... ok          155 ms
      opr_sanity                   ... ok          355 ms
      misc_sanity                  ... ok           43 ms
      comments                     ... ok           20 ms
      expressions                  ... ok          100 ms
      unicode                      ... ok           25 ms
      xid                          ... ok           56 ms
      mvcc                         ... ok          146 ms
test create_function_1            ... ok           10 ms
test create_type                  ... ok           30 ms
test create_table                 ... ok          333 ms
test create_function_2            ... ok           11 ms
parallel group (5 tests):  copydml copyselect insert_conflict insert copy
      copy                         ... ok          336 ms
      copyselect                   ... ok           34 ms
      copydml                      ... ok           28 ms
      insert                       ... ok          291 ms
      insert_conflict              ... FAILED (test process exited with
exit code 2)      239 ms
parallel group (3 tests):  create_operator create_procedure create_misc
      create_misc                  ... ok          131 ms
      create_operator              ... ok           29 ms
      create_procedure             ... ok           52 ms
parallel group (5 tests):  create_view create_index_spgist
index_including create_index index_including_gist
      create_index                 ... FAILED (test process exited with
exit code 2)     3801 ms
      create_index_spgist          ... ok          523 ms
      create_view                  ... FAILED (test process exited with
exit code 2)      339 ms
      index_including              ... FAILED (test process exited with
exit code 2)     3801 ms
      index_including_gist         ... FAILED (test process exited with
exit code 2)     3801 ms
parallel group (16 tests):  create_aggregate create_cast typed_table
drop_if_exists roleattributes create_am hash_func updatable_views errors
infinite_recurse create_function_3 triggers constraints select inherit
vacuum
      create_aggregate             ... FAILED (test process exited with
exit code 2)      164 ms
      create_function_3            ... FAILED (test process exited with
exit code 2)      164 ms
      create_cast                  ... FAILED (test process exited with
exit code 2)      164 ms
      constraints                  ... FAILED (test process exited with
exit code 2)      181 ms
      triggers                     ... FAILED (test process exited with
exit code 2)      181 ms
      select                       ... FAILED (test process exited with
exit code 2)      181 ms
      inherit                      ... FAILED (test process exited with
exit code 2)      181 ms
      typed_table                  ... FAILED (test process exited with
exit code 2)      163 ms
      vacuum                       ... FAILED (test process exited with
exit code 2)      180 ms
      drop_if_exists               ... FAILED (test process exited with
exit code 2)      163 ms
      updatable_views              ... FAILED (test process exited with
exit code 2)      163 ms
      roleattributes               ... FAILED (test process exited with
exit code 2)      163 ms
      create_am                    ... FAILED (test process exited with
exit code 2)      163 ms
      hash_func                    ... FAILED (test process exited with
exit code 2)      162 ms
      errors                       ... FAILED (test process exited with
exit code 2)      162 ms
      infinite_recurse             ... FAILED (test process exited with
exit code 2)      162 ms
test sanity_check                 ... FAILED (test process exited with
exit code 2)       26 ms
parallel group (20 tests):  select_into subselect select_distinct arrays
join namespace hash_index select_having portals transactions aggregates
random update delete union btree_index select_implicit
select_distinct_on prepared_xacts case
      select_into                  ... FAILED (test process exited with
exit code 2)       20 ms
      select_distinct              ... FAILED (test process exited with
exit code 2)       21 ms
      select_distinct_on           ... FAILED (test process exited with
exit code 2)       26 ms
      select_implicit              ... FAILED (test process exited with
exit code 2)       26 ms
      select_having                ... FAILED (test process exited with
exit code 2)       23 ms
      subselect                    ... FAILED (test process exited with
exit code 2)       20 ms
      union                        ... FAILED (test process exited with
exit code 2)       25 ms
      case                         ... FAILED (test process exited with
exit code 2)       27 ms
      join                         ... FAILED (test process exited with
exit code 2)       22 ms
      aggregates                   ... FAILED (test process exited with
exit code 2)       24 ms
      transactions                 ... FAILED (test process exited with
exit code 2)       24 ms
      random                       ... failed (ignored) (test process
exited with exit code 2)       24 ms
      portals                      ... FAILED (test process exited with
exit code 2)       23 ms
      arrays                       ... FAILED (test process exited with
exit code 2)       20 ms
      btree_index                  ... FAILED (test process exited with
exit code 2)       25 ms
      hash_index                   ... FAILED (test process exited with
exit code 2)       22 ms
      update                       ... FAILED (test process exited with
exit code 2)       23 ms
      delete                       ... FAILED (test process exited with
exit code 2)       24 ms
      namespace                    ... FAILED (test process exited with
exit code 2)       21 ms
      prepared_xacts               ... FAILED (test process exited with
exit code 2)       25 ms
parallel group (20 tests):  gist brin identity generated password
tablesample lock matview replica_identity rowsecurity security_label
object_address drop_operator groupingsets join_hash privileges collate
init_privs spgist gin
      brin                         ... FAILED (test process exited with
exit code 2)       15 ms
      gin                          ... FAILED (test process exited with
exit code 2)       22 ms
      gist                         ... FAILED (test process exited with
exit code 2)       13 ms
      spgist                       ... FAILED (test process exited with
exit code 2)       22 ms
      privileges                   ... FAILED (test process exited with
exit code 2)       19 ms
      init_privs                   ... FAILED (test process exited with
exit code 2)       21 ms
      security_label               ... FAILED (test process exited with
exit code 2)       17 ms
      collate                      ... FAILED (test process exited with
exit code 2)       20 ms
      matview                      ... FAILED (test process exited with
exit code 2)       17 ms
      lock                         ... FAILED (test process exited with
exit code 2)       15 ms
      replica_identity             ... FAILED (test process exited with
exit code 2)       17 ms
      rowsecurity                  ... FAILED (test process exited with
exit code 2)       17 ms
      object_address               ... FAILED (test process exited with
exit code 2)       17 ms
      tablesample                  ... FAILED (test process exited with
exit code 2)       15 ms
      groupingsets                 ... FAILED (test process exited with
exit code 2)       17 ms
      drop_operator                ... FAILED (test process exited with
exit code 2)       17 ms
      password                     ... FAILED (test process exited with
exit code 2)       14 ms
      identity                     ... FAILED (test process exited with
exit code 2)       13 ms
      generated                    ... FAILED (test process exited with
exit code 2)       14 ms
      join_hash                    ... FAILED (test process exited with
exit code 2)       18 ms
parallel group (2 tests):  brin_multi brin_bloom
      brin_bloom                   ... FAILED (test process exited with
exit code 2)        4 ms
      brin_multi                   ... FAILED (test process exited with
exit code 2)        4 ms
parallel group (14 tests):  async create_table_like collate.icu.utf8
misc sysviews alter_operator tidscan tidrangescan alter_generic tsrf
incremental_sort misc_functions tid dbsize
      create_table_like            ... FAILED (test process exited with
exit code 2)       10 ms
      alter_generic                ... FAILED (test process exited with
exit code 2)       15 ms
      alter_operator               ... FAILED (test process exited with
exit code 2)       13 ms
      misc                         ... FAILED (test process exited with
exit code 2)       11 ms
      async                        ... FAILED (test process exited with
exit code 2)        9 ms
      dbsize                       ... FAILED (test process exited with
exit code 2)       17 ms
      misc_functions               ... FAILED (test process exited with
exit code 2)       14 ms
      sysviews                     ... FAILED (test process exited with
exit code 2)       12 ms
      tsrf                         ... FAILED (test process exited with
exit code 2)       14 ms
      tid                          ... FAILED (test process exited with
exit code 2)       16 ms
      tidscan                      ... FAILED (test process exited with
exit code 2)       13 ms
      tidrangescan                 ... FAILED (test process exited with
exit code 2)       13 ms
      collate.icu.utf8             ... FAILED (test process exited with
exit code 2)        9 ms
      incremental_sort             ... FAILED (test process exited with
exit code 2)       13 ms
parallel group (6 tests):  amutils psql_crosstab collate.linux.utf8 psql
stats_ext rules
      rules                        ... FAILED (test process exited with
exit code 2)        8 ms
      psql                         ... FAILED (test process exited with
exit code 2)        7 ms
      psql_crosstab                ... FAILED (test process exited with
exit code 2)        7 ms
      amutils                      ... FAILED (test process exited with
exit code 2)        7 ms
      stats_ext                    ... FAILED (test process exited with
exit code 2)        8 ms
      collate.linux.utf8           ... FAILED (test process exited with
exit code 2)        7 ms
test select_parallel              ... FAILED (test process exited with
exit code 2)        3 ms
test write_parallel               ... FAILED (test process exited with
exit code 2)        3 ms
parallel group (2 tests):  publication subscription
      publication                  ... FAILED (test process exited with
exit code 2)        4 ms
      subscription                 ... FAILED (test process exited with
exit code 2)        4 ms
parallel group (17 tests):  select_views foreign_key xmlmap window
functional_deps tsearch cluster combocid bitmapops tsdicts equivclass
guc indirect_toast advisory_lock foreign_data dependency portals_p2
      select_views                 ... FAILED (test process exited with
exit code 2)       11 ms
      portals_p2                   ... FAILED (test process exited with
exit code 2)       20 ms
      foreign_key                  ... FAILED (test process exited with
exit code 2)       11 ms
      cluster                      ... FAILED (test process exited with
exit code 2)       15 ms
      dependency                   ... FAILED (test process exited with
exit code 2)       19 ms
      guc                          ... FAILED (test process exited with
exit code 2)       16 ms
      bitmapops                    ... FAILED (test process exited with
exit code 2)       16 ms
      combocid                     ... FAILED (test process exited with
exit code 2)       15 ms
      tsearch                      ... FAILED (test process exited with
exit code 2)       14 ms
      tsdicts                      ... FAILED (test process exited with
exit code 2)       16 ms
      foreign_data                 ... FAILED (test process exited with
exit code 2)       17 ms
      window                       ... FAILED (test process exited with
exit code 2)       12 ms
      xmlmap                       ... FAILED (test process exited with
exit code 2)       12 ms
      functional_deps              ... FAILED (test process exited with
exit code 2)       13 ms
      advisory_lock                ... FAILED (test process exited with
exit code 2)       16 ms
      indirect_toast               ... FAILED (test process exited with
exit code 2)       16 ms
      equivclass                   ... FAILED (test process exited with
exit code 2)       15 ms
parallel group (6 tests):  json json_encoding jsonb jsonpath_encoding
jsonb_jsonpath jsonpath
      json                         ... FAILED (test process exited with
exit code 2)        4 ms
      jsonb                        ... FAILED (test process exited with
exit code 2)        6 ms
      json_encoding                ... FAILED (test process exited with
exit code 2)        5 ms
      jsonpath                     ... FAILED (test process exited with
exit code 2)       10 ms
      jsonpath_encoding            ... FAILED (test process exited with
exit code 2)        6 ms
      jsonb_jsonpath               ... FAILED (test process exited with
exit code 2)        7 ms
parallel group (19 tests):  plpgsql limit rowtypes sequence largeobject
returning domain polymorphism plancache prepare alter_table truncate
temp rangefuncs with copy2 conversion schema_variables xml
      plancache                    ... FAILED (test process exited with
exit code 2)       16 ms
      limit                        ... FAILED (test process exited with
exit code 2)       10 ms
      plpgsql                      ... FAILED (test process exited with
exit code 2)        7 ms
      copy2                        ... FAILED (test process exited with
exit code 2)       25 ms
      temp                         ... FAILED (test process exited with
exit code 2)       21 ms
      domain                       ... FAILED (test process exited with
exit code 2)       13 ms
      rangefuncs                   ... FAILED (test process exited with
exit code 2)       22 ms
      prepare                      ... FAILED (test process exited with
exit code 2)       19 ms
      conversion                   ... FAILED (test process exited with
exit code 2)       24 ms
      truncate                     ... FAILED (test process exited with
exit code 2)       19 ms
      alter_table                  ... FAILED (test process exited with
exit code 2)       18 ms
      sequence                     ... FAILED (test process exited with
exit code 2)       11 ms
      polymorphism                 ... FAILED (test process exited with
exit code 2)       13 ms
      rowtypes                     ... FAILED (test process exited with
exit code 2)       10 ms
      returning                    ... FAILED (test process exited with
exit code 2)       11 ms
      largeobject                  ... FAILED (test process exited with
exit code 2)       10 ms
      with                         ... FAILED (test process exited with
exit code 2)       22 ms
      xml                          ... FAILED (test process exited with
exit code 2)       23 ms
      schema_variables             ... FAILED (test process exited with
exit code 2)       23 ms
parallel group (11 tests):  explain hash_part partition_info reloptions
memoize compression partition_aggregate partition_join indexing
partition_prune tuplesort
      partition_join               ... FAILED      902 ms
      partition_prune              ... ok         1006 ms
      reloptions                   ... ok          106 ms
      hash_part                    ... ok           99 ms
      indexing                     ... ok          929 ms
      partition_aggregate          ... ok          791 ms
      partition_info               ... ok          104 ms
      tuplesort                    ... ok         1099 ms
      explain                      ... ok           90 ms
      compression                  ... ok          214 ms
      memoize                      ... ok          109 ms
parallel group (2 tests):  event_trigger oidjoins
      event_trigger                ... ok          107 ms
      oidjoins                     ... ok          157 ms
test fast_default                 ... ok          138 ms
test stats                        ... ok          617 ms



On 9/9/21 6:59 AM, Pavel Stehule wrote:
> Hi
>
> fresh rebase
>
> Regards
>
> Pavel
>

Re: Schema variables - new implementation for Postgres 15

From
Gilles Darold
Date:
Le 09/09/2021 à 11:40, Pavel Stehule a écrit :
Hi

čt 9. 9. 2021 v 12:21 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
 > [schema-variables-20210909.patch]

Hi Pavel,

The patch applies and compiles fine but 'make check' for the
assert-enabled fails on 131 out of 210 tests.

This morning I tested it. I'll recheck it.

Pavel


I had not this problem yesterday.


-- 
Gilles Darold
http://www.darold.net/

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 9. 9. 2021 v 13:17 odesílatel Gilles Darold <gilles@darold.net> napsal:
Le 09/09/2021 à 11:40, Pavel Stehule a écrit :
Hi

čt 9. 9. 2021 v 12:21 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
 > [schema-variables-20210909.patch]

Hi Pavel,

The patch applies and compiles fine but 'make check' for the
assert-enabled fails on 131 out of 210 tests.

This morning I tested it. I'll recheck it.

Pavel


I had not this problem yesterday.


I am able to reproduce it. Looks like some current changes of Nodes don't work with this patch. I have to investigate it.

Regards

Pavel


-- 
Gilles Darold
http://www.darold.net/

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

čt 9. 9. 2021 v 12:21 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
 > [schema-variables-20210909.patch]

Hi Pavel,

The patch applies and compiles fine but 'make check' for the
assert-enabled fails on 131 out of 210 tests.

(while compiling HEAD checks run without errors for both assert-disabled
and assert-enabled)



Please, check, attached patch. I fixed a routine for processing a list of identifiers - now it works with the identifier's node more sensitive. Previous implementation of strVal was more tolerant.

Regards

Pavel


Attachment

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
On 9/10/21 10:06 AM, Pavel Stehule wrote:
> Hi
> 
> čt 9. 9. 2021 v 12:21 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
> 
>>
>> Hi Pavel,
>>
>> The patch applies and compiles fine but 'make check' for the
>> assert-enabled fails on 131 out of 210 tests.
>>
>> (while compiling HEAD checks run without errors for both assert-disabled
>> and assert-enabled)
>>
>>
> 
> Please, check, attached patch. I fixed a routine for processing a list of
> identifiers - now it works with the identifier's node more sensitive.
> Previous implementation of strVal was more tolerant.

 > [schema-variables-20210910.patch]

Apply, compile, make, & check(-world), and my small testsuite OK.

So all's well again - Ready for committer!

Thanks,

Erik Rijkers


> Regards
> 
> Pavel
> 
> 
>>
> 



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 10. 9. 2021 v 10:32 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
On 9/10/21 10:06 AM, Pavel Stehule wrote:
> Hi
>
> čt 9. 9. 2021 v 12:21 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
>
>>
>> Hi Pavel,
>>
>> The patch applies and compiles fine but 'make check' for the
>> assert-enabled fails on 131 out of 210 tests.
>>
>> (while compiling HEAD checks run without errors for both assert-disabled
>> and assert-enabled)
>>
>>
>
> Please, check, attached patch. I fixed a routine for processing a list of
> identifiers - now it works with the identifier's node more sensitive.
> Previous implementation of strVal was more tolerant.

 > [schema-variables-20210910.patch]

Apply, compile, make, & check(-world), and my small testsuite OK.

So all's well again - Ready for committer!

Thank you for check and for report

Regards

Pavel


Thanks,

Erik Rijkers


> Regards
>
> Pavel
>
>
>>
>

Re: Schema variables - new implementation for Postgres 15

From
Jaime Casanova
Date:
On Fri, Sep 10, 2021 at 10:06:04AM +0200, Pavel Stehule wrote:
> Hi
> 
> čt 9. 9. 2021 v 12:21 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
> 
> >  > [schema-variables-20210909.patch]
> >
> > Hi Pavel,
> >
> > The patch applies and compiles fine but 'make check' for the
> > assert-enabled fails on 131 out of 210 tests.
> >
> > (while compiling HEAD checks run without errors for both assert-disabled
> > and assert-enabled)
> >
> >
> 
> Please, check, attached patch. I fixed a routine for processing a list of
> identifiers - now it works with the identifier's node more sensitive.
> Previous implementation of strVal was more tolerant.
> 

Hi Pavel,

Just noted that there is no support for REASSIGN OWNED BY:

"""
regression=# create variable random_number numeric;
CREATE VARIABLE
regression=# alter variable random_number owner to jcm;
ALTER VARIABLE
regression=# reassign owned by jcm to jaime;
ERROR:  unexpected classid 9222
"""


TEMP variables are not schema variables? at least not attached to the
schema one expects:

"""
regression=# create temp variable random_number numeric ;
CREATE VARIABLE
regression=# \dV
                                               List of variables
  Schema   |     Name      |  Type   | Is nullable | Is mutable | Default |  Owner   | Transaction
al end action
-----------+---------------+---------+-------------+------------+---------+----------+------------
--------------
 pg_temp_4 | random_number | numeric | t           | t          |         | jcasanov |
(1 row)

regression=# select public.random_number;
ERROR:  missing FROM-clause entry for table "public"
LINE 1: select public.random_number;
               ^
"""

There was a comment that TEMP variables should be DECLAREd instead of
CREATEd, i guess that is because those have similar behaviour. At least,
I would like to see similar messages when using the ON COMMIT DROP
option in a TEMP variable:

"""
regression=# create temp variable random_number numeric on commit drop;
CREATE VARIABLE
regression=# \dV
Did not find any schema variables.
regression=# declare q cursor  for select 1;
ERROR:  DECLARE CURSOR can only be used in transaction blocks
"""

About that, why are you not using syntax ON COMMIT RESET instead on
inventing ON TRANSACTION END RESET? seems better because you already use
ON COMMIT DROP.

I will test more this patch tomorrow. Great work, very complete.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi



Just noted that there is no support for REASSIGN OWNED BY:

"""
regression=# create variable random_number numeric;
CREATE VARIABLE
regression=# alter variable random_number owner to jcm;
ALTER VARIABLE
regression=# reassign owned by jcm to jaime;
ERROR:  unexpected classid 9222
"""


should be fixed by the attached patch, please check.



TEMP variables are not schema variables? at least not attached to the
schema one expects:

temp variables are schema variables like any other. But they are created in temp schema - like temp tables.
I designed it in consistency with temporary tables.


"""
regression=# create temp variable random_number numeric ;
CREATE VARIABLE
regression=# \dV
                                               List of variables
  Schema   |     Name      |  Type   | Is nullable | Is mutable | Default |  Owner   | Transaction
al end action
-----------+---------------+---------+-------------+------------+---------+----------+------------
--------------
 pg_temp_4 | random_number | numeric | t           | t          |         | jcasanov |
(1 row)

regression=# select public.random_number;
ERROR:  missing FROM-clause entry for table "public"
LINE 1: select public.random_number;
               ^
"""

There was a comment that TEMP variables should be DECLAREd instead of
CREATEd, i guess that is because those have similar behaviour. At least,
I would like to see similar messages when using the ON COMMIT DROP
option in a TEMP variable:

I don't remember this comment. When I talked about similarity with the DECLARE statement, I thought about semantic similarity with T-SQL (Microsoft SQL) DECLARE command. Unfortunately, DECLARE command is pretty messy - it exists in SQL, it exists in SQL/PSM and it exists in T-SQL - and every time has similar syntax, but partially different semantics. For me - CREATE TEMP VARIABLE creates session's life limited variable (by default), similarly like DECLARE @localvariable command from T-SQL.


"""
regression=# create temp variable random_number numeric on commit drop;
CREATE VARIABLE
regression=# \dV
Did not find any schema variables.
regression=# declare q cursor  for select 1;
ERROR:  DECLARE CURSOR can only be used in transaction blocks
"""

I have different result

postgres=# create temp variable random_number numeric on commit drop;
CREATE VARIABLE
postgres=# \dV
                                             List of variables
┌────────┬───────────────┬─────────┬─────────────┬────────────┬─────────┬───────┬──────────────────────────┐
│ Schema │     Name      │  Type   │ Is nullable │ Is mutable │ Default │ Owner │ Transactional end action │
╞════════╪═══════════════╪═════════╪═════════════╪════════════╪═════════╪═══════╪══════════════════════════╡
│ public │ random_number │ numeric │ t           │ t          │         │ tom2  │                          │
└────────┴───────────────┴─────────┴─────────────┴────────────┴─────────┴───────┴──────────────────────────┘
(1 row)



About that, why are you not using syntax ON COMMIT RESET instead on
inventing ON TRANSACTION END RESET? seems better because you already use
ON COMMIT DROP.

I thought about this question for a very long time, and I think the introduction of a new clause is better, and I will try to explain why.

One part of this patch are DDL statements - and all DDL statements are consistent with other DDL statements in Postgres. Schema variables DDL commands are transactional and for TEMP variables we can specify a scope - session or transaction, and then clause ON COMMIT DROP is used. You should not need to specify ON ROLLBACK action, because in this case an removing from system catalogue is only one possible action.

Second part of this patch is holding some value in schema variables or initialization with default expression. The default behaviour is not transactional, and the value is stored all session's time by default. But I think it can be very useful to enforce initialization in some specific times - now only the end of the transaction is possible to specify. In the future there can be transaction end, transaction start, rollback, commit, top query start, top query end, ... This logic is different from the logic of DDL commands.  For DDL commands I need to specify behaviour just for the COMMIT end. But for reset of non-transactional schema variables I need to specify any possible end of transaction - COMMIT, ROLLBACK or COMMIT or ROLLBACK. In this initial version I implemented "ON COMMIT OR ROLLBACK RESET", and although it is clean I think it is more readable is the clause that I invented "ON TRANSACTION END". "ON COMMIT RESET" is not exact. "ON COMMIT OR ROLLBACK RESET" sounds a little bit strange for me, but we use something similar in trigger definition "ON INSERT OR UPDATE OR DELETE ..." My opinion is not too strong if "ON TRANSACTION END  RESET"  or "ON COMMIT OR ROLLBACK RESET" is better, and I can change it if people will have different preferences, but I am sure so "ON COMMIT RESET" is not correct in implemented case. And from the perspective of a PLpgSQL developer, I would have initialized the variable on any transaction start, so I need to reset it on any end.

Regards

Pavel



I will test more this patch tomorrow. Great work, very complete.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 12. 9. 2021 v 17:38 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi



Just noted that there is no support for REASSIGN OWNED BY:

"""
regression=# create variable random_number numeric;
CREATE VARIABLE
regression=# alter variable random_number owner to jcm;
ALTER VARIABLE
regression=# reassign owned by jcm to jaime;
ERROR:  unexpected classid 9222
"""


should be fixed by the attached patch, please check.



TEMP variables are not schema variables? at least not attached to the
schema one expects:

temp variables are schema variables like any other. But they are created in temp schema - like temp tables.
I designed it in consistency with temporary tables.


"""
regression=# create temp variable random_number numeric ;
CREATE VARIABLE
regression=# \dV
                                               List of variables
  Schema   |     Name      |  Type   | Is nullable | Is mutable | Default |  Owner   | Transaction
al end action
-----------+---------------+---------+-------------+------------+---------+----------+------------
--------------
 pg_temp_4 | random_number | numeric | t           | t          |         | jcasanov |
(1 row)

regression=# select public.random_number;
ERROR:  missing FROM-clause entry for table "public"
LINE 1: select public.random_number;
               ^
"""

There was a comment that TEMP variables should be DECLAREd instead of
CREATEd, i guess that is because those have similar behaviour. At least,
I would like to see similar messages when using the ON COMMIT DROP
option in a TEMP variable:

I don't remember this comment. When I talked about similarity with the DECLARE statement, I thought about semantic similarity with T-SQL (Microsoft SQL) DECLARE command. Unfortunately, DECLARE command is pretty messy - it exists in SQL, it exists in SQL/PSM and it exists in T-SQL - and every time has similar syntax, but partially different semantics. For me - CREATE TEMP VARIABLE creates session's life limited variable (by default), similarly like DECLARE @localvariable command from T-SQL.

any value of a schema variable has a session (or transaction) life cycle. But the schema variable itself is persistent.  temp schema variable is an exception. It is limited by session (and the value stored in the variable is limited to session too).



"""
regression=# create temp variable random_number numeric on commit drop;
CREATE VARIABLE
regression=# \dV
Did not find any schema variables.
regression=# declare q cursor  for select 1;
ERROR:  DECLARE CURSOR can only be used in transaction blocks
"""

I have different result

postgres=# create temp variable random_number numeric on commit drop;
CREATE VARIABLE
postgres=# \dV
                                             List of variables
┌────────┬───────────────┬─────────┬─────────────┬────────────┬─────────┬───────┬──────────────────────────┐
│ Schema │     Name      │  Type   │ Is nullable │ Is mutable │ Default │ Owner │ Transactional end action │
╞════════╪═══════════════╪═════════╪═════════════╪════════════╪═════════╪═══════╪══════════════════════════╡
│ public │ random_number │ numeric │ t           │ t          │         │ tom2  │                          │
└────────┴───────────────┴─────────┴─────────────┴────────────┴─────────┴───────┴──────────────────────────┘
(1 row)



About that, why are you not using syntax ON COMMIT RESET instead on
inventing ON TRANSACTION END RESET? seems better because you already use
ON COMMIT DROP.

I thought about this question for a very long time, and I think the introduction of a new clause is better, and I will try to explain why.

One part of this patch are DDL statements - and all DDL statements are consistent with other DDL statements in Postgres. Schema variables DDL commands are transactional and for TEMP variables we can specify a scope - session or transaction, and then clause ON COMMIT DROP is used. You should not need to specify ON ROLLBACK action, because in this case an removing from system catalogue is only one possible action.

Second part of this patch is holding some value in schema variables or initialization with default expression. The default behaviour is not transactional, and the value is stored all session's time by default. But I think it can be very useful to enforce initialization in some specific times - now only the end of the transaction is possible to specify. In the future there can be transaction end, transaction start, rollback, commit, top query start, top query end, ... This logic is different from the logic of DDL commands.  For DDL commands I need to specify behaviour just for the COMMIT end. But for reset of non-transactional schema variables I need to specify any possible end of transaction - COMMIT, ROLLBACK or COMMIT or ROLLBACK. In this initial version I implemented "ON COMMIT OR ROLLBACK RESET", and although it is clean I think it is more readable is the clause that I invented "ON TRANSACTION END". "ON COMMIT RESET" is not exact. "ON COMMIT OR ROLLBACK RESET" sounds a little bit strange for me, but we use something similar in trigger definition "ON INSERT OR UPDATE OR DELETE ..." My opinion is not too strong if "ON TRANSACTION END  RESET"  or "ON COMMIT OR ROLLBACK RESET" is better, and I can change it if people will have different preferences, but I am sure so "ON COMMIT RESET" is not correct in implemented case. And from the perspective of a PLpgSQL developer, I would have initialized the variable on any transaction start, so I need to reset it on any end.

Regards

Pavel



I will test more this patch tomorrow. Great work, very complete.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

Re: Schema variables - new implementation for Postgres 15

From
Jaime Casanova
Date:
On Sun, Sep 12, 2021 at 05:38:42PM +0200, Pavel Stehule wrote:
> Hi
> 
> > """
> > regression=# create temp variable random_number numeric on commit drop;
> > CREATE VARIABLE
> > regression=# \dV
> > Did not find any schema variables.
> > regression=# declare q cursor  for select 1;
> > ERROR:  DECLARE CURSOR can only be used in transaction blocks
> > """
> >
> 
> I have different result
> 
> postgres=# create temp variable random_number numeric on commit drop;
> CREATE VARIABLE
> postgres=# \dV
>                                              List of variables
> ┌────────┬───────────────┬─────────┬─────────────┬────────────┬─────────┬───────┬──────────────────────────┐
> │ Schema │     Name      │  Type   │ Is nullable │ Is mutable │ Default │
> Owner │ Transactional end action │
> ╞════════╪═══════════════╪═════════╪═════════════╪════════════╪═════════╪═══════╪══════════════════════════╡
> │ public │ random_number │ numeric │ t           │ t          │         │
> tom2  │                          │
> └────────┴───────────────┴─────────┴─────────────┴────────────┴─────────┴───────┴──────────────────────────┘
> (1 row)
> 
> 
> 

Hi, 

Thanks, will test rebased version.
BTW, that is not the temp variable. You can note it because of the
schema or the lack of a "Transaction end action". That is a normal
non-temp variable that has been created before. A TEMP variable with an
ON COMMIT DROP created outside an explicit transaction will disappear
immediatly like cursor does in the same situation.


-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:



Hi,

Thanks, will test rebased version.
BTW, that is not the temp variable. You can note it because of the
schema or the lack of a "Transaction end action". That is a normal
non-temp variable that has been created before. A TEMP variable with an
ON COMMIT DROP created outside an explicit transaction will disappear
immediatly like cursor does in the same situation.

Unfortunately, I don't see it - or I don't understand to your example from morning mail well

"""
regression=# create temp variable random_number numeric ;
CREATE VARIABLE
regression=# \dV
                                               List of variables
  Schema   |     Name      |  Type   | Is nullable | Is mutable | Default |  Owner   | Transaction
al end action
-----------+---------------+---------+-------------+------------+---------+----------+------------
--------------
 pg_temp_4 | random_number | numeric | t           | t          |         | jcasanov |
(1 row)

regression=# select public.random_number;
ERROR:  missing FROM-clause entry for table "public"
LINE 1: select public.random_number;
               ^
"""
 


--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

just rebase of patch

Regards

Pavel

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

only rebase

Regards

Pavel

čt 16. 9. 2021 v 7:15 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

just rebase of patch

Regards

Pavel

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Tomas Vondra
Date:
Hi,

I took a quick look at the latest patch version. In general the patch
looks pretty complete and clean, and for now I have only some basic
comments. The attached patch tweaks some of this, along with a couple
additional minor changes that I'll not discuss here.


1) Not sure why we need to call this "schema variables". Most objects
are placed in a schema, and we don't say "schema tables" for example.
And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
inconsistent.

The docs actually use "Global variables" in one place for some reason.


2) I find this a bit confusing:

SELECT non_existent_variable;
test=# select s;
ERROR:  column "non_existent_variable" does not exist
LINE 1: select non_existent_variable;

I wonder if this means using SELECT to read variables is a bad idea, and
we should have a separate command, just like we have LET (instead of
just using UPDATE in some way).


3) I've reworded / tweaked a couple places in the docs, but this really
needs a native speaker - I don't have a very good "feeling" for this
technical language so it's probably still quite cumbersome.


4) Is sequential scan of the hash table  in clean_cache_callback() a
good idea? I wonder how fast (with how many variables) it'll become
noticeable, but it may be good enough for now and we can add something
better (tracing which variables need resetting) later.


5) In what situation would we call clean_cache_callback() without a
transaction state? If that happens it seems more like a bug, so
maybeelog(ERROR) or Assert() would be more appropriate?


6) free_schema_variable does not actually use the force parameter


7) The target_exprkind expression in transformSelectStmt really needs
some explanation. Because that's chance you'll look at this in 6 months
and understand what it does?

    target_exprkind =
        (pstate->p_expr_kind != EXPR_KIND_LET_TARGET ||
         pstate->parentParseState != NULL) ?
                    EXPR_KIND_SELECT_TARGET : EXPR_KIND_LET_TARGET;


8) immutable variables without a default value

IMO this case should not be allowed. On 2021/08/29 you wrote:

    I thought about this case, and I have one scenario, where this
    behaviour can be useful. When the variable is declared as IMMUTABLE
    NOT NULL without not null default, then any access to the content of
    the variable has to fail. I think it can be used for detection,
    where and when the variable is first used. So this behavior is
    allowed just because I think, so this feature can be interesting for
    debugging. If this idea is too strange, I have no problem to disable
    this case.

This seems like a really strange use case. In a production code you'll
not do this, because then the variable is useless and the code does not
work at all (it'll just fail whenever it attempts to access the var).
And if you can modify the code, there are other / better ways to do this
(raising an exception, ...).

So this seems pretty useless to me, +1 to disabling it.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Tomas Vondra
Date:
FWIW the patch was marked as RFC for about a year, but there was plenty
of discussion / changes since then, so that seemed premature. I've
switched it back to WoA.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

st 3. 11. 2021 v 14:05 odesílatel Tomas Vondra <tomas.vondra@enterprisedb.com> napsal:
Hi,

I took a quick look at the latest patch version. In general the patch
looks pretty complete and clean, and for now I have only some basic
comments. The attached patch tweaks some of this, along with a couple
additional minor changes that I'll not discuss here.


1) Not sure why we need to call this "schema variables". Most objects
are placed in a schema, and we don't say "schema tables" for example.
And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
inconsistent.

Yes, there is inconsistency, but I think it is necessary. The name "variable" is too generic. Theoretically we can use other adjectives like session variables or global variables and the name will be valid. But it doesn't describe the fundamentals of design. This is similar to the package's variables from PL/SQL. These variables are global, session's variables too. But the usual name is "package variables". So schema variables are assigned to schemes, and I think a good name can be "schema variables". But it is not necessary to repeat keyword schema in the CREATE COMMAND.

My opinion is not too strong in this case, and I can accept just "variables" or "session's variables" or "global variables", but I am not sure if these names describe this feature well, because still they are too generic. There are too many different implementations of session global variables (see PL/SQL or T-SQL or DB2).


The docs actually use "Global variables" in one place for some reason.


2) I find this a bit confusing:

SELECT non_existent_variable;
test=# select s;
ERROR:  column "non_existent_variable" does not exist
LINE 1: select non_existent_variable;

I wonder if this means using SELECT to read variables is a bad idea, and
we should have a separate command, just like we have LET (instead of
just using UPDATE in some way).

I am sure so I want to use variables in SELECTs. One interesting case is using variables in RLS.

I prefer to fix this error message to "column or variable ... does not exist"
 


3) I've reworded / tweaked a couple places in the docs, but this really
needs a native speaker - I don't have a very good "feeling" for this
technical language so it's probably still quite cumbersome.


4) Is sequential scan of the hash table  in clean_cache_callback() a
good idea? I wonder how fast (with how many variables) it'll become
noticeable, but it may be good enough for now and we can add something
better (tracing which variables need resetting) later.


I have to check it.
 

5) In what situation would we call clean_cache_callback() without a
transaction state? If that happens it seems more like a bug, so
maybeelog(ERROR) or Assert() would be more appropriate?

 


6) free_schema_variable does not actually use the force parameter


7) The target_exprkind expression in transformSelectStmt really needs
some explanation. Because that's chance you'll look at this in 6 months
and understand what it does?

    target_exprkind =
        (pstate->p_expr_kind != EXPR_KIND_LET_TARGET ||
         pstate->parentParseState != NULL) ?
                    EXPR_KIND_SELECT_TARGET : EXPR_KIND_LET_TARGET;


8) immutable variables without a default value

IMO this case should not be allowed. On 2021/08/29 you wrote:

    I thought about this case, and I have one scenario, where this
    behaviour can be useful. When the variable is declared as IMMUTABLE
    NOT NULL without not null default, then any access to the content of
    the variable has to fail. I think it can be used for detection,
    where and when the variable is first used. So this behavior is
    allowed just because I think, so this feature can be interesting for
    debugging. If this idea is too strange, I have no problem to disable
    this case.

This seems like a really strange use case. In a production code you'll
not do this, because then the variable is useless and the code does not
work at all (it'll just fail whenever it attempts to access the var).
And if you can modify the code, there are other / better ways to do this
(raising an exception, ...).

So this seems pretty useless to me, +1 to disabling it.

I'll disable it.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Schema variables - new implementation for Postgres 15

From
Justin Pryzby
Date:
On Sat, Nov 06, 2021 at 04:45:19AM +0100, Pavel Stehule wrote:
> st 3. 11. 2021 v 14:05 odesílatel Tomas Vondra <tomas.vondra@enterprisedb.com> napsal:
> > 1) Not sure why we need to call this "schema variables". Most objects
> > are placed in a schema, and we don't say "schema tables" for example.
> > And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
> > inconsistent.

+1

At least the error messages need to be consistent.
It doesn't make sense to have both of these:

+               elog(ERROR, "cache lookup failed for schema variable %u", varid);
+               elog(ERROR, "cache lookup failed for variable %u", varid);

> Yes, there is inconsistency, but I think it is necessary. The name
> "variable" is too generic. Theoretically we can use other adjectives like
> session variables or global variables and the name will be valid. But it
> doesn't describe the fundamentals of design. This is similar to the
> package's variables from PL/SQL. These variables are global, session's
> variables too. But the usual name is "package variables". So schema
> variables are assigned to schemes, and I think a good name can be "schema
> variables". But it is not necessary to repeat keyword schema in the CREATE
> COMMAND.
> 
> My opinion is not too strong in this case, and I can accept just
> "variables" or "session's variables" or "global variables", but I am not
> sure if these names describe this feature well, because still they are too
> generic. There are too many different implementations of session global
> variables (see PL/SQL or T-SQL or DB2).

I would prefer "session variable".

To me, this feature seems similar to a CTE (which exists for a single
statement), or a temporary table (which exists for a single transaction).  So
"session" conveys a lot more of its meaning than "schema".

But don't rename everything just for me...

-- 
Justin



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 6. 11. 2021 v 15:57 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
On Sat, Nov 06, 2021 at 04:45:19AM +0100, Pavel Stehule wrote:
> st 3. 11. 2021 v 14:05 odesílatel Tomas Vondra <tomas.vondra@enterprisedb.com> napsal:
> > 1) Not sure why we need to call this "schema variables". Most objects
> > are placed in a schema, and we don't say "schema tables" for example.
> > And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
> > inconsistent.

+1

At least the error messages need to be consistent.
It doesn't make sense to have both of these:

+               elog(ERROR, "cache lookup failed for schema variable %u", varid);
+               elog(ERROR, "cache lookup failed for variable %u", varid);

> Yes, there is inconsistency, but I think it is necessary. The name
> "variable" is too generic. Theoretically we can use other adjectives like
> session variables or global variables and the name will be valid. But it
> doesn't describe the fundamentals of design. This is similar to the
> package's variables from PL/SQL. These variables are global, session's
> variables too. But the usual name is "package variables". So schema
> variables are assigned to schemes, and I think a good name can be "schema
> variables". But it is not necessary to repeat keyword schema in the CREATE
> COMMAND.
>
> My opinion is not too strong in this case, and I can accept just
> "variables" or "session's variables" or "global variables", but I am not
> sure if these names describe this feature well, because still they are too
> generic. There are too many different implementations of session global
> variables (see PL/SQL or T-SQL or DB2).

I would prefer "session variable".

To me, this feature seems similar to a CTE (which exists for a single
statement), or a temporary table (which exists for a single transaction).  So
"session" conveys a lot more of its meaning than "schema".

It depends on where you are looking. There are two perspectives - data and metadata. And if I use data perspective, then it is session related. If I use metadata perspective, then it can be persistent or temporal like tables. I see strong similarity with Global Temporary Tables - but I think naming "local temporary tables" and "global temporary tables" can be not intuitive or messy for a lot of people too. Anyway, if people will try to find this feature on Google, then probably use keywords "session variables", so maybe my preference of more technical terminology is obscure and not practical, and the name "session variables" can be more practical for other people. If I use the system used for GTT - then the exact name can be "Global Session Variable". Can we use this name? Or shortly just Session Variables because we don't support local session variables now.

What do you think about it?

Regards

Pavel



But don't rename everything just for me...

--
Justin

Re: Schema variables - new implementation for Postgres 15

From
Tomas Vondra
Date:

On 11/6/21 16:40, Pavel Stehule wrote:
> 
> 
> so 6. 11. 2021 v 15:57 odesílatel Justin Pryzby <pryzby@telsasoft.com 
> <mailto:pryzby@telsasoft.com>> napsal:
> 
>     On Sat, Nov 06, 2021 at 04:45:19AM +0100, Pavel Stehule wrote:
>      > st 3. 11. 2021 v 14:05 odesílatel Tomas Vondra
>     <tomas.vondra@enterprisedb.com
>     <mailto:tomas.vondra@enterprisedb.com>> napsal:
>      > > 1) Not sure why we need to call this "schema variables". Most
>     objects
>      > > are placed in a schema, and we don't say "schema tables" for
>     example.
>      > > And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so
>     it's a bit
>      > > inconsistent.
> 
>     +1
> 
>     At least the error messages need to be consistent.
>     It doesn't make sense to have both of these:
> 
>     +               elog(ERROR, "cache lookup failed for schema variable
>     %u", varid);
>     +               elog(ERROR, "cache lookup failed for variable %u",
>     varid);
> 
>      > Yes, there is inconsistency, but I think it is necessary. The name
>      > "variable" is too generic. Theoretically we can use other
>     adjectives like
>      > session variables or global variables and the name will be valid.
>     But it
>      > doesn't describe the fundamentals of design. This is similar to the
>      > package's variables from PL/SQL. These variables are global,
>     session's
>      > variables too. But the usual name is "package variables". So schema
>      > variables are assigned to schemes, and I think a good name can be
>     "schema
>      > variables". But it is not necessary to repeat keyword schema in
>     the CREATE
>      > COMMAND.
>      >
>      > My opinion is not too strong in this case, and I can accept just
>      > "variables" or "session's variables" or "global variables", but I
>     am not
>      > sure if these names describe this feature well, because still
>     they are too
>      > generic. There are too many different implementations of session
>     global
>      > variables (see PL/SQL or T-SQL or DB2).
> 
>     I would prefer "session variable".
> 
>     To me, this feature seems similar to a CTE (which exists for a single
>     statement), or a temporary table (which exists for a single
>     transaction).  So
>     "session" conveys a lot more of its meaning than "schema".
> 
> 
> It depends on where you are looking. There are two perspectives - data 
> and metadata. And if I use data perspective, then it is session related. 
> If I use metadata perspective, then it can be persistent or temporal 
> like tables.

I think you mean "temporary" not "temporal". This really confused me for 
a while, because temporal means "involving time" (e.g. a table with 
from/to timestamp range, etc).

> I see strong similarity with Global Temporary Tables - but 
> I think naming "local temporary tables" and "global temporary tables" 
> can be not intuitive or messy for a lot of people too.

Right, it's a bit like global temporary tables, in the sense that 
there's a shared definition but local (session) state.

> Anyway, if people will try to find this feature on Google, then 
> probably use keywords "session variables", so maybe my preference of
> more technical terminology is obscure and not practical, and the name
> "session variables" can be more practical for other people.
Hmmm, maybe.

> If I use the system used for GTT - then the exact name can be "Global
> Session Variable". Can we use this name? Or shortly just Session
> Variables because we don't support local session variables now.

So a "local variable" would be defined just for a given session, just 
like a temporary table? Wouldn't that have the same issues with catalog 
bloat as temporary tables?

I'd probably vote for "session variables". We can call it local/global 
session variables in the future, if we end up implementing that.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Schema variables - new implementation for Postgres 15

From
Tomas Vondra
Date:
On 11/6/21 04:45, Pavel Stehule wrote:
> Hi
> 
> st 3. 11. 2021 v 14:05 odesílatel Tomas Vondra 
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> 
> napsal:
> 
>     Hi,
> 
>     I took a quick look at the latest patch version. In general the patch
>     looks pretty complete and clean, and for now I have only some basic
>     comments. The attached patch tweaks some of this, along with a couple
>     additional minor changes that I'll not discuss here.
> 
> 
>     1) Not sure why we need to call this "schema variables". Most objects
>     are placed in a schema, and we don't say "schema tables" for example.
>     And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
>     inconsistent.
> 
> 
> Yes, there is inconsistency, but I think it is necessary. The name 
> "variable" is too generic. Theoretically we can use other adjectives 
> like session variables or global variables and the name will be valid. 
> But it doesn't describe the fundamentals of design. This is similar to 
> the package's variables from PL/SQL. These variables are global, 
> session's variables too. But the usual name is "package variables". So 
> schema variables are assigned to schemes, and I think a good name can be 
> "schema variables". But it is not necessary to repeat keyword schema in 
> the CREATE COMMAND.
> 
> My opinion is not too strong in this case, and I can accept just 
> "variables" or "session's variables" or "global variables", but I am not 
> sure if these names describe this feature well, because still they are 
> too generic. There are too many different implementations of session 
> global variables (see PL/SQL or T-SQL or DB2).
> 

OK. "Session variable" seems better to me, but I'm not sure how well 
that matches other databases. I'm not sure how much should we feel 
constrained by naming in other databases, though.

> 
>     The docs actually use "Global variables" in one place for some reason.
> 
> 
>     2) I find this a bit confusing:
> 
>     SELECT non_existent_variable;
>     test=# select s;
>     ERROR:  column "non_existent_variable" does not exist
>     LINE 1: select non_existent_variable;
> 
>     I wonder if this means using SELECT to read variables is a bad idea, and
>     we should have a separate command, just like we have LET (instead of
>     just using UPDATE in some way).
> 
> 
> I am sure so I want to use variables in SELECTs. One interesting case is 
> using variables in RLS.
> 

How much more complicated would it be without the SELECT?

> I prefer to fix this error message to "column or variable ... does not 
> exist"
> 

Not sure it's a good idea to make the error message more ambiguous. Most 
people won't use variables at all, and the message will be less clear 
for them.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 7. 11. 2021 v 22:36 odesílatel Tomas Vondra <tomas.vondra@enterprisedb.com> napsal:
On 11/6/21 04:45, Pavel Stehule wrote:
> Hi
>
> st 3. 11. 2021 v 14:05 odesílatel Tomas Vondra
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
> napsal:
>
>     Hi,
>
>     I took a quick look at the latest patch version. In general the patch
>     looks pretty complete and clean, and for now I have only some basic
>     comments. The attached patch tweaks some of this, along with a couple
>     additional minor changes that I'll not discuss here.
>
>
>     1) Not sure why we need to call this "schema variables". Most objects
>     are placed in a schema, and we don't say "schema tables" for example.
>     And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
>     inconsistent.
>
>
> Yes, there is inconsistency, but I think it is necessary. The name
> "variable" is too generic. Theoretically we can use other adjectives
> like session variables or global variables and the name will be valid.
> But it doesn't describe the fundamentals of design. This is similar to
> the package's variables from PL/SQL. These variables are global,
> session's variables too. But the usual name is "package variables". So
> schema variables are assigned to schemes, and I think a good name can be
> "schema variables". But it is not necessary to repeat keyword schema in
> the CREATE COMMAND.
>
> My opinion is not too strong in this case, and I can accept just
> "variables" or "session's variables" or "global variables", but I am not
> sure if these names describe this feature well, because still they are
> too generic. There are too many different implementations of session
> global variables (see PL/SQL or T-SQL or DB2).
>

OK. "Session variable" seems better to me, but I'm not sure how well
that matches other databases. I'm not sure how much should we feel
constrained by naming in other databases, though.

session variables is generic term - there are big differences already - T-SQL versus PL/SQL or SQL+ or DB2


>
>     The docs actually use "Global variables" in one place for some reason.
>
>
>     2) I find this a bit confusing:
>
>     SELECT non_existent_variable;
>     test=# select s;
>     ERROR:  column "non_existent_variable" does not exist
>     LINE 1: select non_existent_variable;
>
>     I wonder if this means using SELECT to read variables is a bad idea, and
>     we should have a separate command, just like we have LET (instead of
>     just using UPDATE in some way).
>
>
> I am sure so I want to use variables in SELECTs. One interesting case is
> using variables in RLS.
>

How much more complicated would it be without the SELECT?

It is not too complicated, just you want to introduce SELECT2. The sense of session variables is to be used. Has no sense to hold a value on a server without the possibility to use it.

Session variables can be used as global variables in PL/pgSQL. If you cannot use it in SQL expressions, then you need to copy it to a local variable, and then you can use it. That cannot work. This design is a replacement of a untyped not nullable slow workaround based on GUC, there is a necessity to use it in SQL.


> I prefer to fix this error message to "column or variable ... does not
> exist"
>

Not sure it's a good idea to make the error message more ambiguous. Most
people won't use variables at all, and the message will be less clear
for them.

Yes, there is new complexity. But it is an analogy with variables in PL/pgSQL with all benefits and negatives. You don't want to use dynamic SQL everywhere you use PL/pgSQL variables.

There are more cases than RLS in SQL

1. hold value in session (for interactive work or for non interactive scripts). Sometimes you want to reuse value - we can now use CTE or temporary tables. But in this case you have to store relation, you cannot store value, that can be used as a query parameter.

2. allow safe and effective parametrization of SQL scripts, and copy value from client side to server side (there is not risk of SQL injection).

run script with parameter -v xx=10

```
create temp variable xx as int;
set xx = :`xx`;
do $$
  .. -- I can work with variable xx on server side

  ...

$$

This is complement to client side variables - the advantage is possibility to use outside psql, the are type, and the metadata can be permanent.

3. you can share value by PL environments (and by possible clients). But this sharing is secure - the rules are the same like holding value in an table.

Session variables increase complexity a little bit, but increases possibilities and comfort for developers that use databases directly. The analogy with PL/pgSQL variables is well, jut you are not limited to PL/pgSQL scope.

Regards

Pavel




regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi Justin

so 6. 11. 2021 v 2:39 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
On Wed, Nov 03, 2021 at 02:05:02PM +0100, Tomas Vondra wrote:
> 3) I've reworded / tweaked a couple places in the docs, but this really
> needs a native speaker - I don't have a very good "feeling" for this
> technical language so it's probably still quite cumbersome.

On Daniel's suggestion, I have reviewed the docs, and then proofread the rest
of the patch.  My amendments are in 0003.

Thank you for review and fixes, I try to complete some version for next work, and looks so your patch 0001 is broken

gedit reports to me broken unicode \A0\A0\A0\A0\A0

my last patch has 276KB and your patch has 293KB?

Thank you

Pavel


--
Justin

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:




my last patch has 276KB and your patch has 293KB?

Please, can you resend your version of patch 0001?

Thank you

Pavel


 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


po 15. 11. 2021 v 21:23 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
On Mon, Nov 15, 2021 at 09:00:13PM +0100, Pavel Stehule wrote:
> Thank you for review and fixes, I try to complete some version for next
> work, and looks so your patch 0001 is broken
>
> gedit reports to me broken unicode \A0\A0\A0\A0\A0
>
> my last patch has 276KB and your patch has 293KB?

On Mon, Nov 15, 2021 at 09:06:08PM +0100, Pavel Stehule wrote:
> >
> > my last patch has 276KB and your patch has 293KB?
>
> Please, can you resend your version of patch 0001?

https://www.postgresql.org/message-id/20211106013904.GG17618@telsasoft.com

0001 is exactly your patch applied to HEAD, and 0002 are Tomas' changes
relative to your patch.

0003 is my contribution on top.  My intent is that you wouldn't apply 0001, but
rather apply my 0003 on top of your existing branch, and then review 0002/0003,
and then squish the changes into your patch.

I see the 0xa0 stuff in your original patch before my changes, but I'm not sure
what went wrong.

Let me know if you have any issue applying my changes on top of your existing,
local branch ?

It is ok, I was able to apply all your patches to my local branch

Regards

Pavel

--
Justin

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

Hi

8) immutable variables without a default value

IMO this case should not be allowed. On 2021/08/29 you wrote:

    I thought about this case, and I have one scenario, where this
    behaviour can be useful. When the variable is declared as IMMUTABLE
    NOT NULL without not null default, then any access to the content of
    the variable has to fail. I think it can be used for detection,
    where and when the variable is first used. So this behavior is
    allowed just because I think, so this feature can be interesting for
    debugging. If this idea is too strange, I have no problem to disable
    this case.

I checked code, and this case is disallowed already

postgres=# CREATE IMMUTABLE VARIABLE xx AS int NOT NULL;
ERROR:  IMMUTABLE NOT NULL variable requires default expression

Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

po 15. 11. 2021 v 21:23 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
On Mon, Nov 15, 2021 at 09:00:13PM +0100, Pavel Stehule wrote:
> Thank you for review and fixes, I try to complete some version for next
> work, and looks so your patch 0001 is broken
>
> gedit reports to me broken unicode \A0\A0\A0\A0\A0
>
> my last patch has 276KB and your patch has 293KB?

On Mon, Nov 15, 2021 at 09:06:08PM +0100, Pavel Stehule wrote:
> >
> > my last patch has 276KB and your patch has 293KB?
>
> Please, can you resend your version of patch 0001?

https://www.postgresql.org/message-id/20211106013904.GG17618@telsasoft.com

0001 is exactly your patch applied to HEAD, and 0002 are Tomas' changes
relative to your patch.

0003 is my contribution on top.  My intent is that you wouldn't apply 0001, but
rather apply my 0003 on top of your existing branch, and then review 0002/0003,
and then squish the changes into your patch.

I see the 0xa0 stuff in your original patch before my changes, but I'm not sure
what went wrong.

Let me know if you have any issue applying my changes on top of your existing,
local branch ?

I am sending new versions of patches.

I hope I solved all Tomas's objections.

1. The schema variables were renamed to session variables
2. I fixed issues related to creating, dropping variables under subtransactions + regress tests
3. I fixed issues in pg_dump + regress tests

Regards

Pavel


--
Justin
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
Op 19-12-2021 om 07:23 schreef Pavel Stehule:

> 
> I am sending new versions of patches.
> 
> I hope I solved all Tomas's objections.
> 
> 1. The schema variables were renamed to session variables
> 2. I fixed issues related to creating, dropping variables under 
> subtransactions + regress tests
> 3. I fixed issues in pg_dump + regress tests
> 

 > [0001-schema-variables-20211219.patch]
 > [0002-schema-variables-20211219.patch]

Hi Pavel,

I get an error during test 'session_variables'.

(on the upside, my own little testsuite runs without error)

thanks,

Erik Rijkers










Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 19. 12. 2021 v 8:09 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
Op 19-12-2021 om 07:23 schreef Pavel Stehule:

>
> I am sending new versions of patches.
>
> I hope I solved all Tomas's objections.
>
> 1. The schema variables were renamed to session variables
> 2. I fixed issues related to creating, dropping variables under
> subtransactions + regress tests
> 3. I fixed issues in pg_dump + regress tests
>

 > [0001-schema-variables-20211219.patch]
 > [0002-schema-variables-20211219.patch]

Hi Pavel,

I get an error during test 'session_variables'.

(on the upside, my own little testsuite runs without error)

thanks,

please, can you send me regress diff?

Regards

Pavel



Erik Rijkers









Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 19. 12. 2021 v 8:13 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


ne 19. 12. 2021 v 8:09 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
Op 19-12-2021 om 07:23 schreef Pavel Stehule:

>
> I am sending new versions of patches.
>
> I hope I solved all Tomas's objections.
>
> 1. The schema variables were renamed to session variables
> 2. I fixed issues related to creating, dropping variables under
> subtransactions + regress tests
> 3. I fixed issues in pg_dump + regress tests
>

 > [0001-schema-variables-20211219.patch]
 > [0002-schema-variables-20211219.patch]

Hi Pavel,

I get an error during test 'session_variables'.

(on the upside, my own little testsuite runs without error)

thanks,

please, can you send me regress diff?

I see the problem now, the test contains username, and that is wrong.

Schema | Name | Type | Is nullable | Is mutable | Default | Owner | Transactional end action | Access privileges | Description
-----------+------+---------+-------------+------------+---------+-------+--------------------------+------------------------+-------------
- svartest | var1 | numeric | t | t | | pavel | | pavel=SW/pavel +|
- | | | | | | | | var_test_role=SW/pavel |
+----------+------+---------+-------------+------------+---------+----------+--------------------------+---------------------------+-------------
+ svartest | var1 | numeric | t | t | | appveyor | | appveyor=SW/appveyor +|
+ | | | | | | | | var_test_role=SW/appveyor |
(1 row)
REVOKE ALL ON VARIABLE var1 FROM var_test_role;

I have to remove this test

Pavel

Regards

Pavel



Erik Rijkers









Re: Schema variables - new implementation for Postgres 15

From
Erikjan Rijkers
Date:
Op 19-12-2021 om 08:13 schreef Pavel Stehule:
> 
> 
> ne 19. 12. 2021 v 8:09 odesílatel Erik Rijkers <er@xs4all.nl 
>      >
> 
>       > [0001-schema-variables-20211219.patch]
>       > [0002-schema-variables-20211219.patch]
> 
>     Hi Pavel,
> 
>     I get an error during test 'session_variables'.
> 
>     (on the upside, my own little testsuite runs without error)
> 
>     thanks,
> 
> 
> please, can you send me regress diff?
> 

I did attach it but if you did not receive it, see also cfbot, especially

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.156992


Erik




Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 19. 12. 2021 v 8:23 odesílatel Erikjan Rijkers <er@xs4all.nl> napsal:
Op 19-12-2021 om 08:13 schreef Pavel Stehule:
>
>
> ne 19. 12. 2021 v 8:09 odesílatel Erik Rijkers <er@xs4all.nl
>      >
>
>       > [0001-schema-variables-20211219.patch]
>       > [0002-schema-variables-20211219.patch]
>
>     Hi Pavel,
>
>     I get an error during test 'session_variables'.
>
>     (on the upside, my own little testsuite runs without error)
>
>     thanks,
>
>
> please, can you send me regress diff?
>

I did attach it but if you did not receive it, see also cfbot, especially

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.156992

second try

I removed badly written tests

Pavel




Erik

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
 > [0001-schema-variables-20211219-2.patch]
 > [0002-schema-variables-20211219-2.patch]

Hi Pavel,

You said earlier
 > 1. The schema variables were renamed to session variable

But I still see:
$ grep -Eic 'schema variable' postgres.html
15

(postgres.html from 'make postgres.html')

So that rename doesn't seem finished.


Erik






Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 19. 12. 2021 v 11:10 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
 > [0001-schema-variables-20211219-2.patch]
 > [0002-schema-variables-20211219-2.patch]

Hi Pavel,

You said earlier
 > 1. The schema variables were renamed to session variable

But I still see:
$ grep -Eic 'schema variable' postgres.html
15

(postgres.html from 'make postgres.html')

So that rename doesn't seem finished.

Yes, I forgot some changes, and more, there was a bogus regress result file. Thank you for rechecking.

I am sending cleaned patches

Regards

Pavel




Erik



Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

new update - I found an error in checking ambiguous columns - the tupdesc was badly released by FreeTupleDesc. I fixed this issue and did a new related regress test to cover this path.

Regards

Nice holidays

Pavel


Attachment

Fwd: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


---------- Forwarded message ---------
Od: Pavel Stehule <pavel.stehule@gmail.com>
Date: po 27. 12. 2021 v 5:30
Subject: Re: Schema variables - new implementation for Postgres 15
To: Justin Pryzby <pryzby@telsasoft.com>


Hi

ne 26. 12. 2021 v 15:43 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
> > Maybe "session variable" should be added to the glossary.
>
> done

+     A persistent database object that holds an value in session memory.
+     This memory is not shared across sessions, and after session end, this
+     memory (the value) is released. The access (read or write) to session variables
+     is controlled by access rigths similary to other database object access rigts.

an value => a value
rigths => rights
rigts => rights

fixed

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Dean Rasheed
Date:
On Wed, 3 Nov 2021 at 13:05, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> 2) I find this a bit confusing:
>
> SELECT non_existent_variable;
> test=# select s;
> ERROR:  column "non_existent_variable" does not exist
> LINE 1: select non_existent_variable;
>
> I wonder if this means using SELECT to read variables is a bad idea, and
> we should have a separate command, just like we have LET (instead of
> just using UPDATE in some way).
>

Hmm. This way of reading variables worries me for a different reason
-- I think it makes it all too easy to break existing applications by
inadvertently (or deliberately) defining variables that conflict with
column names referred to in existing queries.

For example, if I define a variable called "relkind", then psql's \sv
meta-command is broken because the query it performs can't distinguish
between the column and the variable.

Similarly, there's ambiguity between alias.colname and
schema.variablename. So, for example, if I do the following:

CREATE SCHEMA n;
CREATE VARIABLE n.nspname AS int;

then lots of things are broken, including pg_dump and a number of psql
meta-commands. I don't think it's acceptable to make it so easy for a
user to break the system in this way.

Those are examples that a malicious user might use, but even without
such examples, I think it would be far too easy to inadvertently break
a large application by defining a variable that conflicted with a
column name you didn't know about.

Regards,
Dean



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 13. 1. 2022 v 13:54 odesílatel Dean Rasheed <dean.a.rasheed@gmail.com> napsal:
On Wed, 3 Nov 2021 at 13:05, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> 2) I find this a bit confusing:
>
> SELECT non_existent_variable;
> test=# select s;
> ERROR:  column "non_existent_variable" does not exist
> LINE 1: select non_existent_variable;
>
> I wonder if this means using SELECT to read variables is a bad idea, and
> we should have a separate command, just like we have LET (instead of
> just using UPDATE in some way).
>

Hmm. This way of reading variables worries me for a different reason
-- I think it makes it all too easy to break existing applications by
inadvertently (or deliberately) defining variables that conflict with
column names referred to in existing queries.

For example, if I define a variable called "relkind", then psql's \sv
meta-command is broken because the query it performs can't distinguish
between the column and the variable.

Similarly, there's ambiguity between alias.colname and
schema.variablename. So, for example, if I do the following:

CREATE SCHEMA n;
CREATE VARIABLE n.nspname AS int;

then lots of things are broken, including pg_dump and a number of psql
meta-commands. I don't think it's acceptable to make it so easy for a
user to break the system in this way.

Those are examples that a malicious user might use, but even without
such examples, I think it would be far too easy to inadvertently break
a large application by defining a variable that conflicted with a
column name you didn't know about.

This is a valid issue, and it should be solved, or reduce a risk

I see two possibilities

a) easy solution can be implementation of other conflict strategy - variables have lower priority than tables with possibility to raise warnings if some identifiers are ambiguous. This is easy to implement, and with warning I think there should not be some unwanted surprises for developers. This is safe in meaning - no variable can break any query.

b) harder implementation (but I long think about it) can be implementation of schema scope access. It can be used for implementation of schema private objects. It doesn't solve the described issue, but it can reduce the risk of collision just for one schema.

Both possibilities can be implemented together - but the @b solution should be implemented from zero - and it is more generic concept, and then I prefer @a

Dean, can @a work for you?

Regards

Pavel



Regards,
Dean

Re: Schema variables - new implementation for Postgres 15

From
"Joel Jacobson"
Date:
On Thu, Jan 13, 2022, at 18:24, Dean Rasheed wrote:
> Those are examples that a malicious user might use, but even without
> such examples, I think it would be far too easy to inadvertently break
> a large application by defining a variable that conflicted with a
> column name you didn't know about.

I think there is also a readability problem with the non-locality of this feature.

I think it would be better to have an explicit namespace for these global variables, so that when reading code, they would stand-out.
As a bonus, that would also solve the risk of breaking code, as you pointed out.

Most code should never need any global variables at all, so in the rare occasions when they are needed, I think it's perfectly fine if some more verbose fully-qualified syntax was needed to use them, rather than to pollute the namespace and risk breaking code.

I want to bring up an idea presented earlier in a different thread:

How about exploiting reserved SQL keywords followed by a dot, as special labels?

This could solve the problem with this patch, as well as the other root label patch to access function parameters.

It's an unorthodox idea, but due to legacy, I think we need to be creative, if we want a safe solution with no risk of breaking any code, which I think should be a requirement.

Taking inspiration from Javascript, how about using the SQL reserved keyword "window"?
In Javascript, "window.variableName" means that the variable variableName declared at the global scope.

Furthermore:

"from" could be used to access function/procedure IN parameters.
"to" could be used to access function OUT parameters.
"from" or "to" could be used to access function INOUT parameters.

Examples:

SELECT u.user_id
INTO to.user_id
FROM users u
WHERE u.username = from.username;

-- After authentication, the authenticated user_id could be stored as a global variable:
window.user_id := to.user_id;

-- The authenticated user_id could then be used in queries that should filter on user_id:
SELECT o.order_id
FROM orders o
WHERE o.user_id = window.user_id;

This would require endorsement from the SQL committee of course, otherwise we would face problems if they suddenly would introduce syntax where a reserved keyword could be followed by a dot.

I think from a readability perspective, it works, since the different meanings can be distinguished by writing one in UPPERCASE and the other in lowercase.

/Joel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 13. 1. 2022 v 15:29 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Thu, Jan 13, 2022, at 18:24, Dean Rasheed wrote:
> Those are examples that a malicious user might use, but even without
> such examples, I think it would be far too easy to inadvertently break
> a large application by defining a variable that conflicted with a
> column name you didn't know about.

I think there is also a readability problem with the non-locality of this feature.

I think it would be better to have an explicit namespace for these global variables, so that when reading code, they would stand-out.
As a bonus, that would also solve the risk of breaking code, as you pointed out.

Most code should never need any global variables at all, so in the rare occasions when they are needed, I think it's perfectly fine if some more verbose fully-qualified syntax was needed to use them, rather than to pollute the namespace and risk breaking code.

There are few absolutely valid use cases

1. scripting - currently used GUC instead session variables are slow, and without types

2. RLS

3. Migration from Oracle - although I agree, so package variables are used more times badly, it used there. And only in few times is possibility to refactor code when you do migration from Oracle to Postgres, and there is necessity to have session variables,


I want to bring up an idea presented earlier in a different thread:

How about exploiting reserved SQL keywords followed by a dot, as special labels?

This could solve the problem with this patch, as well as the other root label patch to access function parameters.

It's an unorthodox idea, but due to legacy, I think we need to be creative, if we want a safe solution with no risk of breaking any code, which I think should be a requirement.

Taking inspiration from Javascript, how about using the SQL reserved keyword "window"?
In Javascript, "window.variableName" means that the variable variableName declared at the global scope.

I cannot imagine how the "window" keyword can work in SQL context. In Javascript "window" is an object - it is not a keyword, and it makes sense in usual Javascript context inside HTML browsers.

Regards

Pavel

 

Furthermore:

"from" could be used to access function/procedure IN parameters.
"to" could be used to access function OUT parameters.
"from" or "to" could be used to access function INOUT parameters.

Examples:

SELECT u.user_id
INTO to.user_id
FROM users u
WHERE u.username = from.username;

-- After authentication, the authenticated user_id could be stored as a global variable:
window.user_id := to.user_id;

-- The authenticated user_id could then be used in queries that should filter on user_id:
SELECT o.order_id
FROM orders o
WHERE o.user_id = window.user_id;

This would require endorsement from the SQL committee of course, otherwise we would face problems if they suddenly would introduce syntax where a reserved keyword could be followed by a dot.

I think from a readability perspective, it works, since the different meanings can be distinguished by writing one in UPPERCASE and the other in lowercase.

/Joel

Re: Schema variables - new implementation for Postgres 15

From
"Joel Jacobson"
Date:
On Thu, Jan 13, 2022, at 20:12, Pavel Stehule wrote:
>I cannot imagine how the "window" keyword can work in SQL context. In Javascript "window" is an object - it is not a keyword, and it makes sense in usual Javascript context inside HTML browsers.

I was thinking since Javascript is by far the most known programming language, the "window" word would be familiar and easy to remember, but I agree, it's not perfect.

Hm, "values" would be nice, it's reserved in SQL:2016 [1] and in DB2/Mimer/MySQL/Oracle/SQL Server/Teradata [2], but unfortunately not in PostgreSQL [1], so perhaps not doable.

Syntax:

values.[schema name].[variable name]


Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 13. 1. 2022 v 18:01 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Thu, Jan 13, 2022, at 20:12, Pavel Stehule wrote:
>I cannot imagine how the "window" keyword can work in SQL context. In Javascript "window" is an object - it is not a keyword, and it makes sense in usual Javascript context inside HTML browsers.

I was thinking since Javascript is by far the most known programming language, the "window" word would be familiar and easy to remember, but I agree, it's not perfect.

Mainly the "window" is just a global variable. It is not a special keyword. So the syntax object.property is usual.


Hm, "values" would be nice, it's reserved in SQL:2016 [1] and in DB2/Mimer/MySQL/Oracle/SQL Server/Teradata [2], but unfortunately not in PostgreSQL [1], so perhaps not doable.

Syntax:

values.[schema name].[variable name]

This doesn't help too much. This syntax is too long. It can solve the described issue, but only when all three parts will be required, and writing every time VALUES.schemaname.variablename is not too practical. And if we require this three part identifier every time, then it can be used with the already supported dbname.schemaname.varname. Almost all collisions can be fixed by using a three part identifier. But it doesn't look too handy.

I like the idea of prioritizing tables over variables with warnings when collision is detected. It cannot break anything. And it allows to using short identifiers when there is not collision. If somebody don't want to any collision then can use schema "vars", "values", or what he/she likes. It is near to your proposal - it is not too often so people use table alias like "values" (although in EAV case it is possible).



Re: Schema variables - new implementation for Postgres 15

From
Dean Rasheed
Date:
On Thu, 13 Jan 2022 at 17:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> I like the idea of prioritizing tables over variables with warnings when collision is detected. It cannot break
anything.And it allows to using short identifiers when there is not collision.
 

Yeah, that seems OK, as long as it's clearly documented. I don't think
a warning is necessary.

(FWIW, testing with dbfiddle, that appears to match Db2's behaviour).

Regards,
Dean



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 13. 1. 2022 v 19:23 odesílatel Dean Rasheed <dean.a.rasheed@gmail.com> napsal:
On Thu, 13 Jan 2022 at 17:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> I like the idea of prioritizing tables over variables with warnings when collision is detected. It cannot break anything. And it allows to using short identifiers when there is not collision.

Yeah, that seems OK, as long as it's clearly documented. I don't think
a warning is necessary.

The warning can be disabled by default, but I think it should be there. This is a signal, so some in the database schema should be renamed. Maybe - session_variables_ambiguity_warning.


(FWIW, testing with dbfiddle, that appears to match Db2's behaviour).

Thank you for check

Regards

Pavel


Regards,
Dean

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Thu, Jan 13, 2022 at 07:32:26PM +0100, Pavel Stehule wrote:
> čt 13. 1. 2022 v 19:23 odesílatel Dean Rasheed <dean.a.rasheed@gmail.com>
> napsal:
> 
> > On Thu, 13 Jan 2022 at 17:42, Pavel Stehule <pavel.stehule@gmail.com>
> > wrote:
> > >
> > > I like the idea of prioritizing tables over variables with warnings when
> > collision is detected. It cannot break anything. And it allows to using
> > short identifiers when there is not collision.
> >
> > Yeah, that seems OK, as long as it's clearly documented. I don't think
> > a warning is necessary.

What should be the behavior for a cached plan that uses a variable when a
conflicting relation is later created?  I think that it should be the same as a
search_path change and the plan should be discarded.

> The warning can be disabled by default, but I think it should be there.
> This is a signal, so some in the database schema should be renamed. Maybe -
> session_variables_ambiguity_warning.

I agree that having a way to know that a variable has been bypassed can be
useful.

> > (FWIW, testing with dbfiddle, that appears to match Db2's behaviour).
> >
> 
> Thank you for check

Do you know what's oracle's behavior on that?


I've been looking at the various dependency handling, and I noticed that
collation are ignored, while they're accepted syntax-wise:

=# create collation mycollation (locale = 'fr-FR', provider = 'icu');
CREATE COLLATION

=# create variable myvariable text collate mycollation;
CREATE VARIABLE

=# select classid::regclass, objid, objsubid, refclassid::regclass, refobjid, refobjsubid from pg_depend where
classid::regclass::text= 'pg_variable' or refclassid::regclass::text = 'pg_variable';
 
   classid   | objid | objsubid |  refclassid  | refobjid | refobjsubid
-------------+-------+----------+--------------+----------+-------------
 pg_variable | 16407 |        0 | pg_namespace |     2200 |           0
(1 row)

=# let myvariable = 'AA';
LET

=# select 'AA' collate "en-x-icu" < myvariable;
 ?column?
----------
 f
(1 row)

=# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
ERROR:  42P21: collation mismatch between explicit collations "en-x-icu" and "mycollation"
LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...

So it's missing both dependency recording for variable's collation and also
teaching various code that variables can have a collation.

It's also missing some invalidation detection.  For instance:

=# create variable myval text;
CREATE VARIABLE

=# let myval = 'pg_class';
LET

=# prepare s(text) as select relname from pg_class where relname = $1 or relname = myval;
PREPARE

=# set plan_cache_mode = force_generic_plan ;
SET

=# execute s ('');
 relname
----------
 pg_class
(1 row)

=# drop variable myval ;
DROP VARIABLE

=# create variable myval int;
CREATE VARIABLE

=# execute s ('');
ERROR:  XX000: cache lookup failed for session variable 16408

The plan should have been discarded and the new plan should fail for type
problem.

Strangely, subsequent calls don't error out:

=# execute s('');
 relname
---------
(0 rows)

But doing an explain shows that there's a problem:

=# explain execute s('');
ERROR:  XX000: cache lookup failed for variable 16408



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Thu, Jan 13, 2022 at 07:32:26PM +0100, Pavel Stehule wrote:
> čt 13. 1. 2022 v 19:23 odesílatel Dean Rasheed <dean.a.rasheed@gmail.com>
> napsal:
>
> > On Thu, 13 Jan 2022 at 17:42, Pavel Stehule <pavel.stehule@gmail.com>
> > wrote:
> > >
> > > I like the idea of prioritizing tables over variables with warnings when
> > collision is detected. It cannot break anything. And it allows to using
> > short identifiers when there is not collision.
> >
> > Yeah, that seems OK, as long as it's clearly documented. I don't think
> > a warning is necessary.

What should be the behavior for a cached plan that uses a variable when a
conflicting relation is later created?  I think that it should be the same as a
search_path change and the plan should be discarded.

This is a more generic problem - creating a new DDL object doesn't invalidate plans.


 

> The warning can be disabled by default, but I think it should be there.
> This is a signal, so some in the database schema should be renamed. Maybe -
> session_variables_ambiguity_warning.

I agree that having a way to know that a variable has been bypassed can be
useful.

> > (FWIW, testing with dbfiddle, that appears to match Db2's behaviour).
> >
>
> Thank you for check

Do you know what's oracle's behavior on that?


Oracle is very different, because package variables are not visible from plain SQL. And change of interface invalidates dependent objects and requires recompilation. So it is a little bit more sensitive. If I remember well, the SQL identifiers have bigger priority than PL/SQL identifiers (package variables), so proposed behavior is very similar to Oracle behavior too. The risk of unwanted collision is reduced (on Oracle) by local visibility of package variables, and availability of package variables only in some environments.

 

I've been looking at the various dependency handling, and I noticed that
collation are ignored, while they're accepted syntax-wise:

=# create collation mycollation (locale = 'fr-FR', provider = 'icu');
CREATE COLLATION

=# create variable myvariable text collate mycollation;
CREATE VARIABLE

=# select classid::regclass, objid, objsubid, refclassid::regclass, refobjid, refobjsubid from pg_depend where classid::regclass::text = 'pg_variable' or refclassid::regclass::text = 'pg_variable';
   classid   | objid | objsubid |  refclassid  | refobjid | refobjsubid
-------------+-------+----------+--------------+----------+-------------
 pg_variable | 16407 |        0 | pg_namespace |     2200 |           0
(1 row)

=# let myvariable = 'AA';
LET

=# select 'AA' collate "en-x-icu" < myvariable;
 ?column?
----------
 f
(1 row)

=# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
ERROR:  42P21: collation mismatch between explicit collations "en-x-icu" and "mycollation"
LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...

So it's missing both dependency recording for variable's collation and also
teaching various code that variables can have a collation.

It's also missing some invalidation detection.  For instance:

=# create variable myval text;
CREATE VARIABLE

=# let myval = 'pg_class';
LET

=# prepare s(text) as select relname from pg_class where relname = $1 or relname = myval;
PREPARE

=# set plan_cache_mode = force_generic_plan ;
SET

=# execute s ('');
 relname
----------
 pg_class
(1 row)

=# drop variable myval ;
DROP VARIABLE

=# create variable myval int;
CREATE VARIABLE

=# execute s ('');
ERROR:  XX000: cache lookup failed for session variable 16408

The plan should have been discarded and the new plan should fail for type
problem.

Strangely, subsequent calls don't error out:

=# execute s('');
 relname
---------
(0 rows)

But doing an explain shows that there's a problem:

=# explain execute s('');
ERROR:  XX000: cache lookup failed for variable 16408

looks like bug

Regards

Pavel
 

Re: Schema variables - new implementation for Postgres 15

From
Marcos Pegoraro
Date:
For example, if I define a variable called "relkind", then psql's \sv
meta-command is broken because the query it performs can't distinguish
between the column and the variable.

If variables use : as prefix you´ll never have these conflicts.

select relkind from pg_class where relkind = :relkind

 

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Fri, Jan 14, 2022 at 07:49:09AM -0300, Marcos Pegoraro wrote:
> >
> > For example, if I define a variable called "relkind", then psql's \sv
> > meta-command is broken because the query it performs can't distinguish
> > between the column and the variable.
> >
> If variables use : as prefix you´ll never have these conflicts.
> 
> select relkind from pg_class where relkind = :relkind

This is already used by psql client side variables, so this is not an option.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 14. 1. 2022 v 11:49 odesílatel Marcos Pegoraro <marcos@f10.com.br> napsal:
For example, if I define a variable called "relkind", then psql's \sv
meta-command is broken because the query it performs can't distinguish
between the column and the variable.

If variables use : as prefix you´ll never have these conflicts.

select relkind from pg_class where relkind = :relkind

This syntax is used for client side variables already.

This is similar to MSSQL or MySQL philosophy. But the disadvantage of this method is the impossibility of modularization - all variables are in one space (although there are nested scopes).

The different syntax disallows any collision well, it is far to what is more usual standard in this area. And if we introduce special syntax, then there is no way back. We cannot use :varname - this syntax is used already, but we can use, theoretically, @var or $var. But, personally, I don't want to use it, if there is possibility to do without it. The special syntax can be used maybe for direct access to function arguments, or for not persistent (temporal) session variables like MSSQL. There is a relatively big space of functionality for session variables, and the system that I used is based on ANSI SQL/PSM or DB2 and it is near to Oracle. It has a lot of advantages for writing stored procedures. On other hand, for adhoc work the session variables like MySQL (without declaration) can be handy, so I don't want to use (and block) syntax that can be used for something different.




 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Thu, Jan 13, 2022 at 07:32:26PM +0100, Pavel Stehule wrote:
> čt 13. 1. 2022 v 19:23 odesílatel Dean Rasheed <dean.a.rasheed@gmail.com>
> napsal:
>
> > On Thu, 13 Jan 2022 at 17:42, Pavel Stehule <pavel.stehule@gmail.com>
> > wrote:
> > >
> > > I like the idea of prioritizing tables over variables with warnings when
> > collision is detected. It cannot break anything. And it allows to using
> > short identifiers when there is not collision.
> >
> > Yeah, that seems OK, as long as it's clearly documented. I don't think
> > a warning is necessary.

What should be the behavior for a cached plan that uses a variable when a
conflicting relation is later created?  I think that it should be the same as a
search_path change and the plan should be discarded.

> The warning can be disabled by default, but I think it should be there.
> This is a signal, so some in the database schema should be renamed. Maybe -
> session_variables_ambiguity_warning.

I agree that having a way to know that a variable has been bypassed can be
useful.

done
 

> > (FWIW, testing with dbfiddle, that appears to match Db2's behaviour).
> >
>
> Thank you for check

Do you know what's oracle's behavior on that?


I've been looking at the various dependency handling, and I noticed that
collation are ignored, while they're accepted syntax-wise:

=# "
CREATE COLLATION

=# create variable myvariable text collate mycollation;
CREATE VARIABLE

=# select classid::regclass, objid, objsubid, refclassid::regclass, refobjid, refobjsubid from pg_depend where classid::regclass::text = 'pg_variable' or refclassid::regclass::text = 'pg_variable';
   classid   | objid | objsubid |  refclassid  | refobjid | refobjsubid
-------------+-------+----------+--------------+----------+-------------
 pg_variable | 16407 |        0 | pg_namespace |     2200 |           0
(1 row)

fixed
 

=# let myvariable = 'AA';
LET

=# select 'AA' collate "en-x-icu" < myvariable;
 ?column?
----------
 f
(1 row)

=# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
ERROR:  42P21: collation mismatch between explicit collations "en-x-icu" and "mycollation"
LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...

What do you expect?  I don't understand collating well, but it looks correct. Minimally the tables have the same behavior.

create collation mycollation (locale = 'fr-FR', provider = 'icu');
create table foo(mycol text collate mycollation);
select 'AA' collate "en-x-icu" < mycol from foo;
┌──────────┐
│ ?column? │
╞══════════╡
│ f        │
└──────────┘
(1 row)


postgres=# select 'AA' collate "en-x-icu" < mycol collate mycollation from foo;
ERROR:  collation mismatch between explicit collations "en-x-icu" and "mycollation"
LINE 1: select 'AA' collate "en-x-icu" < mycol collate mycollation f...
                                               ^




So it's missing both dependency recording for variable's collation and also
teaching various code that variables can have a collation.

It's also missing some invalidation detection.  For instance:

=# create variable myval text;
CREATE VARIABLE

=# let myval = 'pg_class';
LET

=# prepare s(text) as select relname from pg_class where relname = $1 or relname = myval;
PREPARE

=# set plan_cache_mode = force_generic_plan ;
SET

=# execute s ('');
 relname
----------
 pg_class
(1 row)

=# drop variable myval ;
DROP VARIABLE

=# create variable myval int;
CREATE VARIABLE

=# execute s ('');
ERROR:  XX000: cache lookup failed for session variable 16408

The plan should have been discarded and the new plan should fail for type
problem.

Strangely, subsequent calls don't error out:

=# execute s('');
 relname
---------
(0 rows)

But doing an explain shows that there's a problem:

=# explain execute s('');
ERROR:  XX000: cache lookup failed for variable 16408

fixed

Please, can you check the attached patches?

Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


=# "
CREATE COLLATION

=# create variable myvariable text collate mycollation;
CREATE VARIABLE

=# select classid::regclass, objid, objsubid, refclassid::regclass, refobjid, refobjsubid from pg_depend where classid::regclass::text = 'pg_variable' or refclassid::regclass::text = 'pg_variable';
   classid   | objid | objsubid |  refclassid  | refobjid | refobjsubid
-------------+-------+----------+--------------+----------+-------------
 pg_variable | 16407 |        0 | pg_namespace |     2200 |           0
(1 row)

fixed
 

=# let myvariable = 'AA';
LET

=# select 'AA' collate "en-x-icu" < myvariable;
 ?column?
----------
 f
(1 row)

=# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
ERROR:  42P21: collation mismatch between explicit collations "en-x-icu" and "mycollation"
LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...

What do you expect?  I don't understand collating well, but it looks correct. Minimally the tables have the same behavior.

create collation mycollation (locale = 'fr-FR', provider = 'icu');
create table foo(mycol text collate mycollation);
select 'AA' collate "en-x-icu" < mycol from foo;
┌──────────┐
│ ?column? │
╞══════════╡
│ f        │
└──────────┘
(1 row)


postgres=# select 'AA' collate "en-x-icu" < mycol collate mycollation from foo;
ERROR:  collation mismatch between explicit collations "en-x-icu" and "mycollation"
LINE 1: select 'AA' collate "en-x-icu" < mycol collate mycollation f...
                                               ^


here is second test

postgres=# CREATE COLLATION nd2 (
  provider = 'icu',
  locale = '@colStrength=secondary', -- or 'und-u-ks-level2'
  deterministic = false
);
CREATE COLLATION
postgres=# create variable testv as text col

postgres=# create variable testv as text collate nd2;
CREATE VARIABLE
postgres=# let testv = 'Ahoj';
LET
postgres=# select testv = 'AHOJ';
┌──────────┐
│ ?column? │
╞══════════╡
│ t        │
└──────────┘
(1 row)

postgres=# select testv = 'AHOJ' collate "default";
┌──────────┐
│ ?column? │
╞══════════╡
│ f        │
└──────────┘
(1 row)

Regards

Pavel


 
 

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Tue, Jan 18, 2022 at 10:01:01PM +0100, Pavel Stehule wrote:
> pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> 
> >
> > =# let myvariable = 'AA';
> > LET
> >
> > =# select 'AA' collate "en-x-icu" < myvariable;
> >  ?column?
> > ----------
> >  f
> > (1 row)
> >
> > =# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
> > ERROR:  42P21: collation mismatch between explicit collations "en-x-icu"
> > and "mycollation"
> > LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...
> >
> 
> What do you expect?  I don't understand collating well, but it looks
> correct. Minimally the tables have the same behavior.

Indeed, I actually didn't know that such object's collation were implicit and
could be overloaded without a problem as long as there's no conflict between
all the explicit collations.  So I agree that the current behavior is ok,
including a correct handling for wanted conflicts:

=# create variable var1 text collate "fr-x-icu";
CREATE VARIABLE

=# create variable var2 text collate "en-x-icu";
CREATE VARIABLE

=# let var1 = 'hoho';
LET

=# let var2 = 'hoho';
LET

=# select var1 < var2;
ERROR:  42P22: could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

> Please, can you check the attached patches?

All the issue I mentioned are fixed, thanks!


I see a few problems with the other new features added though.  The new
session_variables_ambiguity_warning GUC is called even in contexts where it
shouldn't apply.  For instance:

=# set session_variables_ambiguity_warning = 1;
SET

=# create variable v text;
CREATE VARIABLE

=# DO $$
DECLARE v text;
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
WARNING:  42702: session variable "v" is shadowed by column
LINE 1: v := 'test'
        ^
DETAIL:  The identifier can be column reference or session variable reference.
HINT:  The column reference is preferred against session variable reference.
QUERY:  v := 'test'

But this "v := 'test'" shouldn't be a substitute for a LET, and it indeed
doesn't work:

=# DO $$
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
ERROR:  42601: "v" is not a known variable
LINE 3: v := 'test';

But the RAISE NOTICE does see the session variable (which should be the correct
behavior I think), so the warning should have been raised for this instruction
(and in that case the message is incorrect, as it's not shadowing a column).

Also, the pg_dump handling emits a COLLATION option for session variables even
for default collation, while it should only emit it if the collation is not the
type's default collation.  As a reference, for attributes the SQL used is:

                         "CASE WHEN a.attcollation <> t.typcollation "
                         "THEN a.attcollation ELSE 0 END AS attcollation,\n"

Also, should \dV or \dV+ show the collation?

And a few comments on the new chunks in this version of the patch (I didn't
look in detail at the whole patch yet):

+   <para>
+    The session variables can be overshadowed by columns in an query. When query
+    holds identifier or qualified identifier that can be used as session variable
+    identifier and as column identifier too, then it is used as column identifier
+    every time. This situation can be logged by enabling configuration
+    parameter <xref linkend="guc-session-variables-ambiguity-warning"/>.
+   </para>

Is "overshadowed" correct?  The rest of the patch only says "shadow(ed)".

While at it, here's some proposition to improve the phrasing:

+  The session variables can be shadowed by column references in a query. When a
+  query contains identifiers or qualified identifiers that could be used as both
+  a session variable identifiers and as column identifier, then the column
+  identifier is preferred every time. Warnings can be emitted when this situation
+  happens by enabling configuration parameter <xref
+  linkend="guc-session-variables-ambiguity-warning"/>.

Similarly, the next documentation could be rephrased to:

+ When on, a warning is raised when any identifier in a query could be used as both
+ a column identifier or a session variable identifier.
+ The default is <literal>off</literal>.


Few other nitpicking:

+            * If we really detect collision of column and variable identifier,
+            * then we prefer column, because we don't want to allow to break
+            * an existing valid queries by new variable.

s/an existing/existing

+-- it is ambigonuous, but columns are preferred

ambiguous?


@@ -369,6 +367,19 @@ VariableCreate(const char *varName,
    /* dependency on extension */
    recordDependencyOnCurrentExtension(&myself, false);

+   /*
+    * Normal dependency from a domain to its collation.  We know the default
+    * collation is pinned, so don't bother recording it.
+    */
+   if (OidIsValid(varCollation) &&
+       varCollation != DEFAULT_COLLATION_OID)

The comment mentions domains rather than session variables.

And for the initial patch, while looking around I found this comment on
fix_alternative_subplan():

@@ -1866,7 +1969,9 @@ fix_alternative_subplan(PlannerInfo *root, AlternativeSubPlan *asplan,
  * replacing Aggref nodes that should be replaced by initplan output Params,
  * choosing the best implementation for AlternativeSubPlans,
  * looking up operator opcode info for OpExpr and related nodes,
- * and adding OIDs from regclass Const nodes into root->glob->relationOids.
+ * and adding OIDs from regclass Const nodes into root->glob->relationOids,
+ * and replacing PARAM_VARIABLE paramid, that is the oid of the session variable
+ * to offset the array by query used session variables. ???

I don't really understand the comment, and the "???" looks a bit suspicious.
I'm assuming it's a reference to this new behavior in fix_param_node():

  * fix_param_node
  *     Do set_plan_references processing on a Param
+ *     Collect session variables list and replace variable oid by
+ *     index to collected list.
  *
  * If it's a PARAM_MULTIEXPR, replace it with the appropriate Param from
  * root->multiexpr_params; otherwise no change is needed.
  * Just for paranoia's sake, we make a copy of the node in either case.
+ *
+ * If it's a PARAM_VARIABLE, then we should to calculate paramid.

Some improvement on the comments would be welcome there, probably including
some mention to the "glob->sessionVariables" collected list?



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 19. 1. 2022 v 9:01 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Jan 18, 2022 at 10:01:01PM +0100, Pavel Stehule wrote:
> pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> >
> > =# let myvariable = 'AA';
> > LET
> >
> > =# select 'AA' collate "en-x-icu" < myvariable;
> >  ?column?
> > ----------
> >  f
> > (1 row)
> >
> > =# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
> > ERROR:  42P21: collation mismatch between explicit collations "en-x-icu"
> > and "mycollation"
> > LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...
> >
>
> What do you expect?  I don't understand collating well, but it looks
> correct. Minimally the tables have the same behavior.

Indeed, I actually didn't know that such object's collation were implicit and
could be overloaded without a problem as long as there's no conflict between
all the explicit collations.  So I agree that the current behavior is ok,
including a correct handling for wanted conflicts:

=# create variable var1 text collate "fr-x-icu";
CREATE VARIABLE

=# create variable var2 text collate "en-x-icu";
CREATE VARIABLE

=# let var1 = 'hoho';
LET

=# let var2 = 'hoho';
LET

=# select var1 < var2;
ERROR:  42P22: could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

> Please, can you check the attached patches?

All the issue I mentioned are fixed, thanks!


thank you for check
 

I see a few problems with the other new features added though.  The new
session_variables_ambiguity_warning GUC is called even in contexts where it
shouldn't apply.  For instance:

=# set session_variables_ambiguity_warning = 1;
SET

=# create variable v text;
CREATE VARIABLE

=# DO $$
DECLARE v text;
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
WARNING:  42702: session variable "v" is shadowed by column
LINE 1: v := 'test'
        ^
DETAIL:  The identifier can be column reference or session variable reference.
HINT:  The column reference is preferred against session variable reference.
QUERY:  v := 'test'

But this "v := 'test'" shouldn't be a substitute for a LET, and it indeed
doesn't work:

Yes, there are some mistakes (bugs). The PLpgSQL assignment as target should not see session variables, so warning is nonsense there. RAISE NOTICE should use local variables, and in this case is a question if we should raise a warning. There are two possible analogies - we can see session variables like global variables, and then the warning should not be raised, or we can see relation between session variables and plpgsql variables similar like session variables and some with higher priority, and then warning should be raised. If we want to ensure that the new session variable doesn't break code, then session variables should have lower priority than plpgsql variables too. And because the plpgsql protection against collision cannot  be used, then I prefer raising the warning. 

PLpgSQL assignment should not be in collision with session variables ever

=# DO $$
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
ERROR:  42601: "v" is not a known variable
LINE 3: v := 'test';

But the RAISE NOTICE does see the session variable (which should be the correct
behavior I think), so the warning should have been raised for this instruction
(and in that case the message is incorrect, as it's not shadowing a column).

In this case I can detect node type, and I can identify external param node, but I cannot to detect if this code was executed from PLpgSQL or from some other

So I can to modify warning text to some

DETAIL:  The identifier can be column reference or query parameter or session variable reference.
HINT:  The column reference and query parameter is preferred against session variable reference.

I cannot to use term "plpgsql variable" becase I cannot to ensure validity of this message

Maybe is better to don't talk about source of this issue, and just talk about result - so the warning text should be just

MESSAGE: "session variable \"xxxx\" is shadowed
DETAIL: "session variables can be shadowed by columns, routine's variables and routine's arguments with same name"

Is it better?

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Wed, Jan 19, 2022 at 09:09:41PM +0100, Pavel Stehule wrote:
> st 19. 1. 2022 v 9:01 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> 
> RAISE NOTICE should use local variables, and in this case is a question if we
> should raise a warning. There are two possible analogies - we can see session
> variables like global variables, and then the warning should not be raised,
> or we can see relation between session variables and plpgsql variables
> similar like session variables and some with higher priority, and then
> warning should be raised. If we want to ensure that the new session variable
> doesn't break code, then session variables should have lower priority than
> plpgsql variables too. And because the plpgsql protection against collision
> cannot  be used, then I prefer raising the warning.

Ah that's indeed a good point.  I agree, they're from a different part of the
system so they should be treated as different things, and thus raising a
warning.  It's consistent with the chosen conservative approach anyway.

> PLpgSQL assignment should not be in collision with session variables ever

Agreed.

> 
> >
> > =# DO $$
> > BEGIN
> > v := 'test';
> > RAISE NOTICE 'v: %', v;
> > END;
> > $$ LANGUAGE plpgsql;
> > ERROR:  42601: "v" is not a known variable
> > LINE 3: v := 'test';
> >
> > But the RAISE NOTICE does see the session variable (which should be the
> > correct
> > behavior I think), so the warning should have been raised for this
> > instruction
> > (and in that case the message is incorrect, as it's not shadowing a
> > column).
> >
> 
> In this case I can detect node type, and I can identify external param
> node, but I cannot to detect if this code was executed from PLpgSQL or from
> some other
> 
> So I can to modify warning text to some

Yes, that's what I had in mind too.

> DETAIL:  The identifier can be column reference or query parameter or
> session variable reference.
> HINT:  The column reference and query parameter is preferred against
> session variable reference.
> 
> I cannot to use term "plpgsql variable" becase I cannot to ensure validity
> of this message
> 
> Maybe is better to don't talk about source of this issue, and just talk
> about result - so the warning text should be just
> 
> MESSAGE: "session variable \"xxxx\" is shadowed
> DETAIL: "session variables can be shadowed by columns, routine's variables
> and routine's arguments with same name"
> 
> Is it better?

I clearly prefer the 2nd version.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

st 19. 1. 2022 v 9:01 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Jan 18, 2022 at 10:01:01PM +0100, Pavel Stehule wrote:
> pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> >
> > =# let myvariable = 'AA';
> > LET
> >
> > =# select 'AA' collate "en-x-icu" < myvariable;
> >  ?column?
> > ----------
> >  f
> > (1 row)
> >
> > =# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
> > ERROR:  42P21: collation mismatch between explicit collations "en-x-icu"
> > and "mycollation"
> > LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...
> >
>
> What do you expect?  I don't understand collating well, but it looks
> correct. Minimally the tables have the same behavior.

Indeed, I actually didn't know that such object's collation were implicit and
could be overloaded without a problem as long as there's no conflict between
all the explicit collations.  So I agree that the current behavior is ok,
including a correct handling for wanted conflicts:

=# create variable var1 text collate "fr-x-icu";
CREATE VARIABLE

=# create variable var2 text collate "en-x-icu";
CREATE VARIABLE

=# let var1 = 'hoho';
LET

=# let var2 = 'hoho';
LET

=# select var1 < var2;
ERROR:  42P22: could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

> Please, can you check the attached patches?

All the issue I mentioned are fixed, thanks!


I see a few problems with the other new features added though.  The new
session_variables_ambiguity_warning GUC is called even in contexts where it
shouldn't apply.  For instance:

=# set session_variables_ambiguity_warning = 1;
SET

=# create variable v text;
CREATE VARIABLE

=# DO $$
DECLARE v text;
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
WARNING:  42702: session variable "v" is shadowed by column
LINE 1: v := 'test'
        ^
DETAIL:  The identifier can be column reference or session variable reference.
HINT:  The column reference is preferred against session variable reference.
QUERY:  v := 'test'

But this "v := 'test'" shouldn't be a substitute for a LET, and it indeed
doesn't work:

=# DO $$
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
ERROR:  42601: "v" is not a known variable
LINE 3: v := 'test';

fixed
 

But the RAISE NOTICE does see the session variable (which should be the correct
behavior I think), so the warning should have been raised for this instruction
(and in that case the message is incorrect, as it's not shadowing a column).

Also, the pg_dump handling emits a COLLATION option for session variables even
for default collation, while it should only emit it if the collation is not the
type's default collation.  As a reference, for attributes the SQL used is:

                                                 "CASE WHEN a.attcollation <> t.typcollation "
                                                 "THEN a.attcollation ELSE 0 END AS attcollation,\n"

Isn't it a different issue? I don't see filtering DEFAULT_COLLATION_OID in pg_dump code. But this case protects against a redundant COLLATE clause, and for consistency, this check should be done for variables too.

<-->/*
<--> * Find all the user attributes and their types.
<--> *
<--> * Since we only want to dump COLLATE clauses for attributes whose
<--> * collation is different from their type's default, we use a CASE here to
<--> * suppress uninteresting attcollations cheaply.
<--> */

fixed

 

Also, should \dV or \dV+ show the collation?

I did it for \dV
 

And a few comments on the new chunks in this version of the patch (I didn't
look in detail at the whole patch yet):

+   <para>
+    The session variables can be overshadowed by columns in an query. When query
+    holds identifier or qualified identifier that can be used as session variable
+    identifier and as column identifier too, then it is used as column identifier
+    every time. This situation can be logged by enabling configuration
+    parameter <xref linkend="guc-session-variables-ambiguity-warning"/>.
+   </para>

Is "overshadowed" correct?  The rest of the patch only says "shadow(ed)".

While at it, here's some proposition to improve the phrasing:

+  The session variables can be shadowed by column references in a query. When a
+  query contains identifiers or qualified identifiers that could be used as both
+  a session variable identifiers and as column identifier, then the column
+  identifier is preferred every time. Warnings can be emitted when this situation
+  happens by enabling configuration parameter <xref
+  linkend="guc-session-variables-ambiguity-warning"/>.

Similarly, the next documentation could be rephrased to:

+ When on, a warning is raised when any identifier in a query could be used as both
+ a column identifier or a session variable identifier.
+ The default is <literal>off</literal>.


changed
 

Few other nitpicking:

+            * If we really detect collision of column and variable identifier,
+            * then we prefer column, because we don't want to allow to break
+            * an existing valid queries by new variable.

s/an existing/existing

refactorized
 

+-- it is ambigonuous, but columns are preferred

ambiguous?

fixed
 


@@ -369,6 +367,19 @@ VariableCreate(const char *varName,
    /* dependency on extension */
    recordDependencyOnCurrentExtension(&myself, false);

+   /*
+    * Normal dependency from a domain to its collation.  We know the default
+    * collation is pinned, so don't bother recording it.
+    */
+   if (OidIsValid(varCollation) &&
+       varCollation != DEFAULT_COLLATION_OID)

The comment mentions domains rather than session variables.


fixed
 
And for the initial patch, while looking around I found this comment on
fix_alternative_subplan():

this is little bit strange - modified function is fix_scan_expr

@@ -1866,7 +1969,9 @@ fix_alternative_subplan(PlannerInfo *root, AlternativeSubPlan *asplan,
  * replacing Aggref nodes that should be replaced by initplan output Params,
  * choosing the best implementation for AlternativeSubPlans,
  * looking up operator opcode info for OpExpr and related nodes,
- * and adding OIDs from regclass Const nodes into root->glob->relationOids.
+ * and adding OIDs from regclass Const nodes into root->glob->relationOids,
+ * and replacing PARAM_VARIABLE paramid, that is the oid of the session variable
+ * to offset the array by query used session variables. ???

I don't really understand the comment, and the "???" looks a bit suspicious.
I'm assuming it's a reference to this new behavior in fix_param_node():

yes, I modified this comment
 

  * fix_param_node
  *     Do set_plan_references processing on a Param
+ *     Collect session variables list and replace variable oid by
+ *     index to collected list.
  *
  * If it's a PARAM_MULTIEXPR, replace it with the appropriate Param from
  * root->multiexpr_params; otherwise no change is needed.
  * Just for paranoia's sake, we make a copy of the node in either case.
+ *
+ * If it's a PARAM_VARIABLE, then we should to calculate paramid.

Some improvement on the comments would be welcome there, probably including
some mention to the "glob->sessionVariables" collected list?

done

I am sending updated patches

Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Fri, Jan 21, 2022 at 09:23:34PM +0100, Pavel Stehule wrote:
> 
> st 19. 1. 2022 v 9:01 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> >
> > Also, the pg_dump handling emits a COLLATION option for session variables
> > even
> > for default collation, while it should only emit it if the collation is
> > not the
> > type's default collation.  As a reference, for attributes the SQL used is:
> >
> >                                                  "CASE WHEN a.attcollation
> > <> t.typcollation "
> >                                                  "THEN a.attcollation ELSE
> > 0 END AS attcollation,\n"
> >
> 
> Isn't it a different issue? I don't see filtering DEFAULT_COLLATION_OID in
> pg_dump code. But this case protects against a redundant COLLATE clause,
> and for consistency, this check should be done for variables too.

Yes, sorry my message was a bit ambiguous as for all native collatable types
the "default" collation is the type's default collation, I thought that the
code extract would make it clear enough.

In any case your fix is exactly what I had in mind so it's perfect, thanks!

> I am sending updated patches

Thanks a lot!  I will try to review them over the weekend.



Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Fri, Jan 21, 2022 at 09:23:34PM +0100, Pavel Stehule wrote:
> 
> I am sending updated patches

I've been looking a bit deeper at the feature and I noticed that there's no
locking involved around the session variable usage, and I don't think that's
ok.  AFAICS any variable used in a session will be cached in the local hash
table and will never try to access some catalog or cache, so I don't have any
naive scenario that would immediately crash, but this has some other
implications that seems debatable.

For instance, right now nothing prevents a variable from being dropped while
another session is using it.

Obviously we can't lock a session variable forever just because a session
assigned a value once ages ago, especially outside of the current transaction.
But if a session set a variable in the local transaction, I don't think that
it's ok to have a subsequent query failing because someone else concurrently
dropped the variable.

I only backlogged this current thread but I didn't see that being discussed.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 23. 1. 2022 v 9:10 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Fri, Jan 21, 2022 at 09:23:34PM +0100, Pavel Stehule wrote:
>
> I am sending updated patches

I've been looking a bit deeper at the feature and I noticed that there's no
locking involved around the session variable usage, and I don't think that's
ok.  AFAICS any variable used in a session will be cached in the local hash
table and will never try to access some catalog or cache, so I don't have any
naive scenario that would immediately crash, but this has some other
implications that seems debatable.

For instance, right now nothing prevents a variable from being dropped while
another session is using it.

Obviously we can't lock a session variable forever just because a session
assigned a value once ages ago, especially outside of the current transaction.
But if a session set a variable in the local transaction, I don't think that
it's ok to have a subsequent query failing because someone else concurrently
dropped the variable.

I only backlogged this current thread but I didn't see that being discussed.

Isn't there enough stability of the system cache? sinval is sent at the moment when changes in the system catalog are visible. So inside query execution I don't see that the variable was dropped in another session.



 

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Sun, Jan 23, 2022 at 09:25:56AM +0100, Pavel Stehule wrote:
> ne 23. 1. 2022 v 9:10 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> 
> Isn't there enough stability of the system cache? sinval is sent at the
> moment when changes in the system catalog are visible. So inside query
> execution I don't see that the variable was dropped in another session.

Yes, inside a single query it should probably be ok, but I'm talking about
multiple query execution in the same transaction.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 23. 1. 2022 v 9:52 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Sun, Jan 23, 2022 at 09:25:56AM +0100, Pavel Stehule wrote:
> ne 23. 1. 2022 v 9:10 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> Isn't there enough stability of the system cache? sinval is sent at the
> moment when changes in the system catalog are visible. So inside query
> execution I don't see that the variable was dropped in another session.

Yes, inside a single query it should probably be ok, but I'm talking about
multiple query execution in the same transaction.

I tested it now. a sinval message is waiting on the transaction end.  So when a variable is used, then it is working fine until the transaction ends.
But when the session makes some DDL, then send sinval to self, and at this moment, the variable can be dropped before the transaction ends.

So to be safe, the lock is required. I'll do it tomorrow.

Regards

Pavel





Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi, 

Le dim. 23 janv. 2022 à 22:34, Pavel Stehule <pavel.stehule@gmail.com> a écrit :
I tested it now. a sinval message is waiting on the transaction end.  So when a variable is used, then it is working fine until the transaction ends.
But when the session makes some DDL, then send sinval to self, and at this moment, the variable can be dropped before the transaction ends.

a backend can accept sinval in very common scenarios, like acquiring a heavyweight lock. That includes accessing a relation thats not in the catcache, so that's really critical to have a protection here. 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 23. 1. 2022 v 16:06 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi, 

Le dim. 23 janv. 2022 à 22:34, Pavel Stehule <pavel.stehule@gmail.com> a écrit :
I tested it now. a sinval message is waiting on the transaction end.  So when a variable is used, then it is working fine until the transaction ends.
But when the session makes some DDL, then send sinval to self, and at this moment, the variable can be dropped before the transaction ends.

a backend can accept sinval in very common scenarios, like acquiring a heavyweight lock. That includes accessing a relation thats not in the catcache, so that's really critical to have a protection here. 

here is updated patch with locking support

Regards

Pavel
 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Mon, Jan 24, 2022 at 12:33:11PM +0100, Pavel Stehule wrote:
> 
> here is updated patch with locking support

Thanks for updating the patch!

While the locking is globally working as intended, I found a few problems with
it.

First, I don't think that acquiring the lock in
get_session_variable_type_typmod_collid() and prepare_variable_for_reading() is
the correct approach.  In transformColumnRef() and transformLetStmt() you first
call IdentifyVariable() to check if the given name is a variable without
locking it and later try to lock the variable if you get a valid Oid.  This is
bug prone as any other backend could drop the variable between the two calls
and you would end up with a cache lookup failure.  I think the lock should be
acquired during IdentifyVariable.  It should probably be optional as one
codepath only needs the information to raise a warning when a variable is
shadowed, so a concurrent drop isn't a problem there.

For prepare_variable_for_reading(), the callers are CopySessionVariable() and
GetSessionVariable().  IIUC those should take care of executor-time locks, but
shouldn't there be some changes for planning, like in AcquirePlannerLocks()?

Some other comments on this part of the patch:

@@ -717,6 +730,9 @@ RemoveSessionVariable(Oid varid)
    Relation    rel;
    HeapTuple   tup;

+   /* Wait, when dropped variable is not used */
+   LockDatabaseObject(VariableRelationId, varid, 0, AccessExclusiveLock);

Why do you explicitly try to acquire an AEL on the variable here?
RemoveObjects / get_object_address should guarantee that this was already done.
You could add an assert LockHeldByMe() here, but no other code path do it so it
would probably waste cycles in assert builds for nothing as it's a fundamental
guarantee.


@@ -747,6 +763,9 @@ RemoveSessionVariable(Oid varid)
     * only when current transaction will be commited.
     */
    register_session_variable_xact_action(varid, ON_COMMIT_RESET);
+
+   /* Release lock */
+   UnlockDatabaseObject(VariableRelationId, varid, 0, AccessExclusiveLock);
 }

Why releasing the lock here?  It will be done at the end of the transaction,
and you certainly don't want other backends to start using this variable in
between.  Also, since you acquired the lock a second time it only decreases the
lock count in the locallock so the lock isn't released anyway.

+ * Returns type, typmod and collid of session variable.
+ *
+ * As a side effect this function acquires AccessShareLock on the
+ * related session variable.
  */
 void
-get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod, Oid *collid)
+get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod, Oid *collid,
+                                       bool lock_held)


lock_held is a bit misleading.  If you keep some similar parameter for this or
another function, maybe name it lock_it or something like that instead?

Also, the comment isn't accurate and should say that an ASL is acquired iff the
variable is true.

+   /*
+    * Acquire a lock on session variable, which we won't release until commit.
+    * This ensure that one backend cannot to drop session variable used by
+    * second backend.
+    */

(and similar comments)
I don't think it's necessary to explain why we acquire locks, we should just
say that the lock will be kept for the whole transaction (and not until a
commit)

And while looking at nearby code, it's probably worthwhile to add an Assert in
create_sessionvars_hashtable() to validate that sessionvars htab is NULL.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Mon, Jan 24, 2022 at 12:33:11PM +0100, Pavel Stehule wrote:
>
> here is updated patch with locking support

Thanks for updating the patch!

While the locking is globally working as intended, I found a few problems with
it.

First, I don't think that acquiring the lock in
get_session_variable_type_typmod_collid() and prepare_variable_for_reading() is
the correct approach.  In transformColumnRef() and transformLetStmt() you first
call IdentifyVariable() to check if the given name is a variable without
locking it and later try to lock the variable if you get a valid Oid.  This is
bug prone as any other backend could drop the variable between the two calls
and you would end up with a cache lookup failure.  I think the lock should be
acquired during IdentifyVariable.  It should probably be optional as one
codepath only needs the information to raise a warning when a variable is
shadowed, so a concurrent drop isn't a problem there.

There is a problem, because before the IdentifyVariable call I don't know if the variable will be shadowed or not.

If I lock a variable inside IdentifyVariable, then I need to remember if I did lock there, or if the variable was locked already, and If the variable is shadowed and if lock is fresh, then I can unlock the variable.


Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Tue, Jan 25, 2022 at 09:35:09AM +0100, Pavel Stehule wrote:
> út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> 
> > I think the lock should be
> > acquired during IdentifyVariable.  It should probably be optional as one
> > codepath only needs the information to raise a warning when a variable is
> > shadowed, so a concurrent drop isn't a problem there.
> >
> 
> There is a problem, because before the IdentifyVariable call I don't know
> if the variable will be shadowed or not.
> 
> If I lock a variable inside IdentifyVariable, then I need to remember if I
> did lock there, or if the variable was locked already, and If the variable
> is shadowed and if lock is fresh, then I can unlock the variable.

But in transformColumnRef() you already know if you found a matching column or
not when calling IdentifyVariable(), so you know if an existing variable will
shadow it right?

Couldn't you call something like

    lockit = node == NULL;
    varid = IdentifyVariable(cref->fields, &attrname, ¬_unique, lockit);

The only other caller is transformLetStmt(), which should always lock the
variable anyway.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 25. 1. 2022 v 9:48 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Jan 25, 2022 at 09:35:09AM +0100, Pavel Stehule wrote:
> út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> > I think the lock should be
> > acquired during IdentifyVariable.  It should probably be optional as one
> > codepath only needs the information to raise a warning when a variable is
> > shadowed, so a concurrent drop isn't a problem there.
> >
>
> There is a problem, because before the IdentifyVariable call I don't know
> if the variable will be shadowed or not.
>
> If I lock a variable inside IdentifyVariable, then I need to remember if I
> did lock there, or if the variable was locked already, and If the variable
> is shadowed and if lock is fresh, then I can unlock the variable.

But in transformColumnRef() you already know if you found a matching column or
not when calling IdentifyVariable(), so you know if an existing variable will
shadow it right?

yes, you have true,

Thank you

 

Couldn't you call something like

    lockit = node == NULL;
        varid = IdentifyVariable(cref->fields, &attrname, &not_unique, lockit);

The only other caller is transformLetStmt(), which should always lock the
variable anyway.

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Mon, Jan 24, 2022 at 12:33:11PM +0100, Pavel Stehule wrote:
>
> here is updated patch with locking support

Thanks for updating the patch!

While the locking is globally working as intended, I found a few problems with
it.

First, I don't think that acquiring the lock in
get_session_variable_type_typmod_collid() and prepare_variable_for_reading() is
the correct approach.  In transformColumnRef() and transformLetStmt() you first
call IdentifyVariable() to check if the given name is a variable without
locking it and later try to lock the variable if you get a valid Oid.  This is
bug prone as any other backend could drop the variable between the two calls
and you would end up with a cache lookup failure.  I think the lock should be
acquired during IdentifyVariable.  It should probably be optional as one
codepath only needs the information to raise a warning when a variable is
shadowed, so a concurrent drop isn't a problem there.

I moved lock to IdentifyVariable routine
 

For prepare_variable_for_reading(), the callers are CopySessionVariable() and
GetSessionVariable().  IIUC those should take care of executor-time locks, but
shouldn't there be some changes for planning, like in AcquirePlannerLocks()?

done
 

Some other comments on this part of the patch:

@@ -717,6 +730,9 @@ RemoveSessionVariable(Oid varid)
    Relation    rel;
    HeapTuple   tup;

+   /* Wait, when dropped variable is not used */
+   LockDatabaseObject(VariableRelationId, varid, 0, AccessExclusiveLock);

Why do you explicitly try to acquire an AEL on the variable here?
RemoveObjects / get_object_address should guarantee that this was already done.
You could add an assert LockHeldByMe() here, but no other code path do it so it
would probably waste cycles in assert builds for nothing as it's a fundamental
guarantee.


removed
 

@@ -747,6 +763,9 @@ RemoveSessionVariable(Oid varid)
     * only when current transaction will be commited.
     */
    register_session_variable_xact_action(varid, ON_COMMIT_RESET);
+
+   /* Release lock */
+   UnlockDatabaseObject(VariableRelationId, varid, 0, AccessExclusiveLock);
 }

Why releasing the lock here?  It will be done at the end of the transaction,
and you certainly don't want other backends to start using this variable in
between.  Also, since you acquired the lock a second time it only decreases the
lock count in the locallock so the lock isn't released anyway.


 removed

+ * Returns type, typmod and collid of session variable.
+ *
+ * As a side effect this function acquires AccessShareLock on the
+ * related session variable.
  */
 void
-get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod, Oid *collid)
+get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod, Oid *collid,
+                                       bool lock_held)


lock_held is a bit misleading.  If you keep some similar parameter for this or
another function, maybe name it lock_it or something like that instead?

Also, the comment isn't accurate and should say that an ASL is acquired iff the
variable is true.

removed

 

+   /*
+    * Acquire a lock on session variable, which we won't release until commit.
+    * This ensure that one backend cannot to drop session variable used by
+    * second backend.
+    */

(and similar comments)
I don't think it's necessary to explain why we acquire locks, we should just
say that the lock will be kept for the whole transaction (and not until a
commit)

removed
 

And while looking at nearby code, it's probably worthwhile to add an Assert in
create_sessionvars_hashtable() to validate that sessionvars htab is NULL.

done

attached updated patch

Regards

Pavel
 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Tue, Jan 25, 2022 at 10:53:00PM +0100, Pavel Stehule wrote:
> 
> út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> >
> > First, I don't think that acquiring the lock in
> > get_session_variable_type_typmod_collid() and
> > prepare_variable_for_reading() is
> > the correct approach.  In transformColumnRef() and transformLetStmt() you
> > first
> > call IdentifyVariable() to check if the given name is a variable without
> > locking it and later try to lock the variable if you get a valid Oid.
> > This is
> > bug prone as any other backend could drop the variable between the two
> > calls
> > and you would end up with a cache lookup failure.  I think the lock should
> > be
> > acquired during IdentifyVariable.  It should probably be optional as one
> > codepath only needs the information to raise a warning when a variable is
> > shadowed, so a concurrent drop isn't a problem there.
> >
> 
> I moved lock to IdentifyVariable routine

+IdentifyVariable(List *names, char **attrname, bool lockit, bool *not_unique)
+{
[...]
+               return varoid_without_attr;
+           }
+           else
+           {
+               *attrname = c;
+               return varoid_with_attr;
[...]
+
+   if (OidIsValid(varid) && lockit)
+       LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
+
+   return varid;

There are still some code paths that may not lock the target variable when
required.

Also, the function comment doesn't say much about attrname handling, it should
be clarifed.  I think it should initially be set to NULL, to make sure that
it's always a valid pointer after the function returns.


> attached updated patch

Various comments on the patch:

No test for GRANT/REVOKE ... ALL VARIABLES IN SCHEMA, maybe it would be good to
have one?

Documentation:

catalogs.sgml:

You're still using the old-style 4 columns table, it should be a single column
like the rest of the file.

+  <para>
+   The <command>CREATE VARIABLE</command> command creates a session variable.
+   Session variables, like relations, exist within a schema and their access is
+   controlled via <command>GRANT</command> and <command>REVOKE</command>
+   commands.  Changing a session variable is non-transactional.
+  </para>

The "changing a session variable is non-transactional" is ambiguous.  I think
that only the value part isn't transactional, the variable metadata themselves
(ALTER VARIABLE and other DDL) are transactional right?  This should be
explicitly described here (although it's made less ambiguous in the next
paragraph).

+  <para>
+   Session variables are retrieved by the <command>SELECT</command> SQL
+   command.  Their value is set with the <command>LET</command> SQL command.
+   While session variables share properties with tables, their value cannot be
+   updated with an <command>UPDATE</command> command.
+  </para>

should this part mention that session variables can be shadowed?  For now the
only mention to that is in advanced.sgml.

+      The <literal>DEFAULT</literal> clause can be used to assign a default
+      value to a session variable.

The expression is lazily evaluated during the session first use of the
variable.  This should be documented as any usage of volatile expression will
be impacted.

+      The <literal>ON TRANSACTION END RESET</literal>
+      clause causes the session variable to be reset to its default value when
+      the transaction is committed or rolled back.

As far as I can see this clauses doesn't play well with IMMUTABLE VARIABLE, as
you can reassign a value once the transaction ends.  Same for DISCARD [ ALL |
VARIABLES ], or LET var = NULL (or DEFAULT if no default value).  Is that
intended?

+   <literal>LET</literal> extends the syntax defined in the SQL
+   standard. The <literal>SET</literal> command from the SQL standard
+   is used for different purposes in <productname>PostgreSQL</productname>.

I don't fully understand that.  Are (session) variables defined in the SQL
standard?  If yes, all the other documentation pages should clarify that as
they currently say that this is a postgres extension.  If not, this part should
made it clear what is defined in the standard.

In revoke.sgml:
+ REVOKE [ GRANT OPTION FOR ]
+     { { READ | WRITE } [, ...] | ALL [ PRIVILEGES ] }
+     ON VARIABLE <replaceable>variable_name</replaceable> [, ...]
+     FROM { [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC } [, ...]
+     [ CASCADE | RESTRICT ]

there's no extra documentation for that, and therefore no clarification on
variable_name.

VariableIsVisible():
+         * If it is in the path, it might still not be visible; it could be
+         * hidden by another relation of the same name earlier in the path. So
+         * we must do a slow check for conflicting relations.

should it be "another variable of the same name"?


Tab completion: CREATE IMMUTABLE VARIABLE is not handled


pg_variable.c:
Do we really need both session_variable_get_name() and
get_session_variable_name()?

+/*
+ * Fetch all fields of session variable from the syscache.
+ */
+void
+initVariable(Variable *var, Oid varid, bool missing_ok, bool fast_only)

As least fast_only should be documented in the function comment, especially
regarding var->varname, since:

+   var->oid = varid;
+   var->name = pstrdup(NameStr(varform->varname));
[...]
+   if (!fast_only)
+   {
+       Datum       aclDatum;
+       bool        isnull;
+
+       /* name */
+       var->name = pstrdup(NameStr(varform->varname));A
[...]
+   else
+   {
+       var->name = NULL;

is the output value guaranteed or not?  In any case it shouldn't be set twice.

Also, I don't see any caller for missing_ok == true, should we remove it?

+/*
+ * Create entry in pg_variable table
+ */
+ObjectAddress
+VariableCreate(const char *varName,
[...]
+   /* dependency on any roles mentioned in ACL */
+   if (varacl != NULL)
+   {
+       int         nnewmembers;
+       Oid        *newmembers;
+
+       nnewmembers = aclmembers(varacl, &newmembers);
+       updateAclDependencies(VariableRelationId, varid, 0,
+                             varOwner,
+                             0, NULL,
+                             nnewmembers, newmembers);

Shouldn't you use recordDependencyOnNewAcl() instead?  Also, sn't it missing a
recordDependencyOnOwner()?

sessionvariable.c:

+ * Although session variables are not transactional, we don't
+ * want (and we cannot) to run cleaning immediately (when we
+ * got sinval message). The value of session variables can
+ * be still used or the operation that emits cleaning can be
+ * reverted. Unfortunatelly, this check can be done only in
+ * when transaction is committed (the check against system
+ * catalog requires transaction state).

This was the original idea, but since there's now locking to make all DDL safe,
the metadata should be considered fully transactional and no session should
still be able to use a concurrently dropped variable.  Also, the invalidation
messages are not sent until the transaction is committed.  So is that approach
still needed (at least for things outside ON COMMIT DROP / ON TRANSACTION END
RESET)?

I'm also attaching a 3rd patch with some proposition for documentation
rewording (including consistent use of *session* variable), a few comments
rewording, copyright year bump and minor things like that.

Note that I still didn't really review pg_variable.c or sessionvariable.c since
there might be significant changes there for either the sinval / immutable part
I mentioned.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

sessionvariable.c:

+ * Although session variables are not transactional, we don't
+ * want (and we cannot) to run cleaning immediately (when we
+ * got sinval message). The value of session variables can
+ * be still used or the operation that emits cleaning can be
+ * reverted. Unfortunatelly, this check can be done only in
+ * when transaction is committed (the check against system
+ * catalog requires transaction state).

This was the original idea, but since there's now locking to make all DDL safe,
the metadata should be considered fully transactional and no session should
still be able to use a concurrently dropped variable.  Also, the invalidation
messages are not sent until the transaction is committed.  So is that approach
still needed (at least for things outside ON COMMIT DROP / ON TRANSACTION END
RESET

I think this is still necessary. The lock protects the variable against drop from the second session, but not for reverted deletion from the current session.

This implementation is due Tomas's request for

CREATE VARIABLE xx AS int;
LET xx = 100;
BEGIN;
DROP VARIABLE xx;
ROLLBACK;
SELECT xx; --> 100

and the variable still holds the last value before DROP

Personally, this is a corner case (for me, and I think so for users it is not too interesting, and important),  and this behavior is not necessary - originally I implemented just the RESET variable in this case. On the other hand, this is a nice feature, and there is an analogy with TRUNCATE behavior.

More, I promised, as a second step, implementation of optional transactional behavior of session variables. And related code is necessary for it. So I prefer to use related code without change.

Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Wed, Jan 26, 2022 at 02:43:54PM +0100, Pavel Stehule wrote:
> 
> I think this is still necessary. The lock protects the variable against
> drop from the second session, but not for reverted deletion from the
> current session.
> 
> This implementation is due Tomas's request for
> 
> CREATE VARIABLE xx AS int;
> LET xx = 100;
> BEGIN;
> DROP VARIABLE xx;
> ROLLBACK;
> SELECT xx; --> 100
> 
> and the variable still holds the last value before DROP

I thought about this case, but assumed that the own session wouldn't process
the inval until commit. Agreed then, although the comment should clarify the
transactional behavior and why it's still necessary.

> Personally, this is a corner case (for me, and I think so for users it is
> not too interesting, and important),  and this behavior is not necessary -
> originally I implemented just the RESET variable in this case. On the other
> hand, this is a nice feature, and there is an analogy with TRUNCATE
> behavior.
> 
> More, I promised, as a second step, implementation of optional
> transactional behavior of session variables. And related code is necessary
> for it. So I prefer to use related code without change.

That's another good reason, so fine by me!



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 26. 1. 2022 v 8:23 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Jan 25, 2022 at 10:53:00PM +0100, Pavel Stehule wrote:
>
> út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> >
> > First, I don't think that acquiring the lock in
> > get_session_variable_type_typmod_collid() and
> > prepare_variable_for_reading() is
> > the correct approach.  In transformColumnRef() and transformLetStmt() you
> > first
> > call IdentifyVariable() to check if the given name is a variable without
> > locking it and later try to lock the variable if you get a valid Oid.
> > This is
> > bug prone as any other backend could drop the variable between the two
> > calls
> > and you would end up with a cache lookup failure.  I think the lock should
> > be
> > acquired during IdentifyVariable.  It should probably be optional as one
> > codepath only needs the information to raise a warning when a variable is
> > shadowed, so a concurrent drop isn't a problem there.
> >
>
> I moved lock to IdentifyVariable routine

+IdentifyVariable(List *names, char **attrname, bool lockit, bool *not_unique)
+{
[...]
+               return varoid_without_attr;
+           }
+           else
+           {
+               *attrname = c;
+               return varoid_with_attr;
[...]
+
+   if (OidIsValid(varid) && lockit)
+       LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
+
+   return varid;

There are still some code paths that may not lock the target variable when
required.

fixed
 

Also, the function comment doesn't say much about attrname handling, it should
be clarifed.  I think it should initially be set to NULL, to make sure that
it's always a valid pointer after the function returns.

done
 


> attached updated patch

Various comments on the patch:

No test for GRANT/REVOKE ... ALL VARIABLES IN SCHEMA, maybe it would be good to
have one?

done
 

Documentation:

catalogs.sgml:

You're still using the old-style 4 columns table, it should be a single column
like the rest of the file.

done
 

+  <para>
+   The <command>CREATE VARIABLE</command> command creates a session variable.
+   Session variables, like relations, exist within a schema and their access is
+   controlled via <command>GRANT</command> and <command>REVOKE</command>
+   commands.  Changing a session variable is non-transactional.
+  </para>

The "changing a session variable is non-transactional" is ambiguous.  I think
that only the value part isn't transactional, the variable metadata themselves
(ALTER VARIABLE and other DDL) are transactional right?  This should be
explicitly described here (although it's made less ambiguous in the next
paragraph).

sure, DDL of session variables are transactional. I removed this sentence.


+  <para>
+   Session variables are retrieved by the <command>SELECT</command> SQL
+   command.  Their value is set with the <command>LET</command> SQL command.
+   While session variables share properties with tables, their value cannot be
+   updated with an <command>UPDATE</command> command.
+  </para>

should this part mention that session variables can be shadowed?  For now the
only mention to that is in advanced.sgml.

good idea, I wrote note about it there
 

+      The <literal>DEFAULT</literal> clause can be used to assign a default
+      value to a session variable.

The expression is lazily evaluated during the session first use of the
variable.  This should be documented as any usage of volatile expression will
be impacted.

done

 

+      The <literal>ON TRANSACTION END RESET</literal>
+      clause causes the session variable to be reset to its default value when
+      the transaction is committed or rolled back.

As far as I can see this clauses doesn't play well with IMMUTABLE VARIABLE, as
you can reassign a value once the transaction ends.  Same for DISCARD [ ALL |
VARIABLES ], or LET var = NULL (or DEFAULT if no default value).  Is that
intended?

I think so it is expected. The life scope of assigned (immutable) value is limited to transaction (when ON TRANSACTION END RESET).
DISCARD is used for reset of session, and after it, you can write the value first time.

I enhanced doc in IMMUTABLE clause


+   <literal>LET</literal> extends the syntax defined in the SQL
+   standard. The <literal>SET</literal> command from the SQL standard
+   is used for different purposes in <productname>PostgreSQL</productname>.

I don't fully understand that.  Are (session) variables defined in the SQL
standard?  If yes, all the other documentation pages should clarify that as
they currently say that this is a postgres extension.  If not, this part should
made it clear what is defined in the standard.

I reread standard more carefully, and it looks so SQL/PSM doesn't define global variables ever. The modules defined by SQL/PSM can holds only temporal tables or routines. Unfortunately, this part of standard is almost dead, and there is not referential implementation. The most near to standard in this area is DB2, but global session variables are proprietary feature. The usage is very similar to our session variables with one significant difference - the global session variables can be modified by commands SELECT INTO, VALUES INTO, EXECUTE INTO and SET (Our session variables can be modified just by LET command.). I am sure, so if SQL/PSM supports global session variables, then it uses SET statement - like DB2, but I didn't find any note about support in standard.

I think so the best comment to compatibility is just

  <para>
   The <command>LET</command> is a <productname>PostgreSQL</productname>
   extension.
  </para>

 

In revoke.sgml:
+ REVOKE [ GRANT OPTION FOR ]
+     { { READ | WRITE } [, ...] | ALL [ PRIVILEGES ] }
+     ON VARIABLE <replaceable>variable_name</replaceable> [, ...]
+     FROM { [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC } [, ...]
+     [ CASCADE | RESTRICT ]

there's no extra documentation for that, and therefore no clarification on
variable_name.

This is same like function_name, domain_name, ...
 

VariableIsVisible():
+                * If it is in the path, it might still not be visible; it could be
+                * hidden by another relation of the same name earlier in the path. So
+                * we must do a slow check for conflicting relations.

should it be "another variable of the same name"?


yes, fixed

 

Tab completion: CREATE IMMUTABLE VARIABLE is not handled

fixed
 


pg_variable.c:
Do we really need both session_variable_get_name() and
get_session_variable_name()?

They are different - first returns possibly qualified name, second returns only name. Currently it is used just for error messages in transformAssignmentIndirection, and I think so it is good for consistency with other usage of this routine (transformAssignmentIndirection).
 

+/*
+ * Fetch all fields of session variable from the syscache.
+ */
+void
+initVariable(Variable *var, Oid varid, bool missing_ok, bool fast_only)

As least fast_only should be documented in the function comment, especially
regarding var->varname, since:

+   var->oid = varid;
+   var->name = pstrdup(NameStr(varform->varname));
[...]
+   if (!fast_only)
+   {
+       Datum       aclDatum;
+       bool        isnull;
+
+       /* name */
+       var->name = pstrdup(NameStr(varform->varname));A
[...]
+   else
+   {
+       var->name = NULL;

is the output value guaranteed or not?  In any case it shouldn't be set twice.

It was messed, fixed
 

Also, I don't see any caller for missing_ok == true, should we remove it?

removed
 

+/*
+ * Create entry in pg_variable table
+ */
+ObjectAddress
+VariableCreate(const char *varName,
[...]
+   /* dependency on any roles mentioned in ACL */
+   if (varacl != NULL)
+   {
+       int         nnewmembers;
+       Oid        *newmembers;
+
+       nnewmembers = aclmembers(varacl, &newmembers);
+       updateAclDependencies(VariableRelationId, varid, 0,
+                             varOwner,
+                             0, NULL,
+                             nnewmembers, newmembers);

Shouldn't you use recordDependencyOnNewAcl() instead?  Also, sn't it missing a
recordDependencyOnOwner()?

changed and fixed
 

sessionvariable.c:

+ * Although session variables are not transactional, we don't
+ * want (and we cannot) to run cleaning immediately (when we
+ * got sinval message). The value of session variables can
+ * be still used or the operation that emits cleaning can be
+ * reverted. Unfortunatelly, this check can be done only in
+ * when transaction is committed (the check against system
+ * catalog requires transaction state).

This was the original idea, but since there's now locking to make all DDL safe,
the metadata should be considered fully transactional and no session should
still be able to use a concurrently dropped variable.  Also, the invalidation
messages are not sent until the transaction is committed.  So is that approach
still needed (at least for things outside ON COMMIT DROP / ON TRANSACTION END
RESET)?

I enhanced comment
 

I'm also attaching a 3rd patch with some proposition for documentation
rewording (including consistent use of *session* variable), a few comments
rewording, copyright year bump and minor things like that.

Thank you very much for it. This patch is based on your changes.

Regards

Pavel
 

Note that I still didn't really review pg_variable.c or sessionvariable.c since
there might be significant changes there for either the sinval / immutable part
I mentioned.
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Fri, Jan 28, 2022 at 07:51:08AM +0100, Pavel Stehule wrote:
> st 26. 1. 2022 v 8:23 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> 
> > +      The <literal>ON TRANSACTION END RESET</literal>
> > +      clause causes the session variable to be reset to its default value
> > when
> > +      the transaction is committed or rolled back.
> >
> > As far as I can see this clauses doesn't play well with IMMUTABLE
> > VARIABLE, as
> > you can reassign a value once the transaction ends.  Same for DISCARD [
> > ALL |
> > VARIABLES ], or LET var = NULL (or DEFAULT if no default value).  Is that
> > intended?
> >
> 
> I think so it is expected. The life scope of assigned (immutable) value is
> limited to transaction (when ON TRANSACTION END RESET).
> DISCARD is used for reset of session, and after it, you can write the value
> first time.
> 
> I enhanced doc in IMMUTABLE clause

I think it's still somewhat unclear:

-      done, no other change will be allowed in the session lifetime.
+      done, no other change will be allowed in the session variable content's
+      lifetime. The lifetime of content of session variable can be
+      controlled by <literal>ON TRANSACTION END RESET</literal> clause.
+     </para>

The "session variable content lifetime" is quite peculiar, as the ON
TRANSACTION END RESET is adding transactional behavior to something that's not
supposed to be transactional, so more documentation about it seems appropriate.

Also DISCARD can be used any time so that's a totally different aspect of the
immutable variable content lifetime that's not described here.

NULL handling also seems inconsistent.  An explicit default NULL value makes it
truly immutable, but manually assigning NULL is a different codepath that has a
different user behavior:

# create immutable variable var_immu int default null;
CREATE VARIABLE

# let var_immu = 1;
ERROR:  22005: session variable "var_immu" is declared IMMUTABLE

# create immutable variable var_immu2 int ;
CREATE VARIABLE

# let var_immu2 = null;
LET

# let var_immu2 = null;
LET

# let var_immu2 = 1;
LET

For var_immu2 I think that the last 2 queries should have errored out.

> > In revoke.sgml:
> > + REVOKE [ GRANT OPTION FOR ]
> > +     { { READ | WRITE } [, ...] | ALL [ PRIVILEGES ] }
> > +     ON VARIABLE <replaceable>variable_name</replaceable> [, ...]
> > +     FROM { [ GROUP ] <replaceable
> > class="parameter">role_name</replaceable> | PUBLIC } [, ...]
> > +     [ CASCADE | RESTRICT ]
> >
> > there's no extra documentation for that, and therefore no clarification on
> > variable_name.
> >
> 
> This is same like function_name, domain_name, ...

Ah right.

> > pg_variable.c:
> > Do we really need both session_variable_get_name() and
> > get_session_variable_name()?
> >
> 
> They are different - first returns possibly qualified name, second returns
> only name. Currently it is used just for error messages in
> transformAssignmentIndirection, and I think so it is good for consistency
> with other usage of this routine (transformAssignmentIndirection).

I agree that consistency with other usage is a good thing, but both functions
have very similar and confusing names.  Usually when you need the qualified
name the calling code just takes care of doing so.  Wouldn't it be better to
add say get_session_variable_namespace() and construct the target string in the
calling code?

Also, I didn't dig a lot but I didn't see other usage with optionally qualified
name there?  I'm not sure how it would make sense anyway since LET semantics
are different and the current call for session variable emit incorrect
messages:

# create table tt(id integer);
CREATE TABLE

# create variable vv tt;
CREATE VARIABLE

# let vv.meh = 1;
ERROR:  42703: cannot assign to field "meh" of column "meh" because there is no such column in data type tt
LINE 1: let vv.meh = 1;



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 29. 1. 2022 v 6:19 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Fri, Jan 28, 2022 at 07:51:08AM +0100, Pavel Stehule wrote:
> st 26. 1. 2022 v 8:23 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> > +      The <literal>ON TRANSACTION END RESET</literal>
> > +      clause causes the session variable to be reset to its default value
> > when
> > +      the transaction is committed or rolled back.
> >
> > As far as I can see this clauses doesn't play well with IMMUTABLE
> > VARIABLE, as
> > you can reassign a value once the transaction ends.  Same for DISCARD [
> > ALL |
> > VARIABLES ], or LET var = NULL (or DEFAULT if no default value).  Is that
> > intended?
> >
>
> I think so it is expected. The life scope of assigned (immutable) value is
> limited to transaction (when ON TRANSACTION END RESET).
> DISCARD is used for reset of session, and after it, you can write the value
> first time.
>
> I enhanced doc in IMMUTABLE clause

I think it's still somewhat unclear:

-      done, no other change will be allowed in the session lifetime.
+      done, no other change will be allowed in the session variable content's
+      lifetime. The lifetime of content of session variable can be
+      controlled by <literal>ON TRANSACTION END RESET</literal> clause.
+     </para>

The "session variable content lifetime" is quite peculiar, as the ON
TRANSACTION END RESET is adding transactional behavior to something that's not
supposed to be transactional, so more documentation about it seems appropriate.

Also DISCARD can be used any time so that's a totally different aspect of the
immutable variable content lifetime that's not described here.

fixed
 

 

NULL handling also seems inconsistent.  An explicit default NULL value makes it
truly immutable, but manually assigning NULL is a different codepath that has a
different user behavior:

# create immutable variable var_immu int default null;
CREATE VARIABLE

# let var_immu = 1;
ERROR:  22005: session variable "var_immu" is declared IMMUTABLE

# create immutable variable var_immu2 int ;
CREATE VARIABLE

# let var_immu2 = null;
LET

# let var_immu2 = null;
LET

# let var_immu2 = 1;
LET

For var_immu2 I think that the last 2 queries should have errored out.

ok, I changed this behave
 

> > In revoke.sgml:
> > + REVOKE [ GRANT OPTION FOR ]
> > +     { { READ | WRITE } [, ...] | ALL [ PRIVILEGES ] }
> > +     ON VARIABLE <replaceable>variable_name</replaceable> [, ...]
> > +     FROM { [ GROUP ] <replaceable
> > class="parameter">role_name</replaceable> | PUBLIC } [, ...]
> > +     [ CASCADE | RESTRICT ]
> >
> > there's no extra documentation for that, and therefore no clarification on
> > variable_name.
> >
>
> This is same like function_name, domain_name, ...

Ah right.

> > pg_variable.c:
> > Do we really need both session_variable_get_name() and
> > get_session_variable_name()?
> >
>
> They are different - first returns possibly qualified name, second returns
> only name. Currently it is used just for error messages in
> transformAssignmentIndirection, and I think so it is good for consistency
> with other usage of this routine (transformAssignmentIndirection).

I agree that consistency with other usage is a good thing, but both functions
have very similar and confusing names.  Usually when you need the qualified
name the calling code just takes care of doing so.  Wouldn't it be better to
add say get_session_variable_namespace() and construct the target string in the
calling code?

ok, I rewrote related code
 

Also, I didn't dig a lot but I didn't see other usage with optionally qualified
name there?  I'm not sure how it would make sense anyway since LET semantics
are different and the current call for session variable emit incorrect
messages:

changed


# create table tt(id integer);
CREATE TABLE

# create variable vv tt;
CREATE VARIABLE

# let vv.meh = 1;
ERROR:  42703: cannot assign to field "meh" of column "meh" because there is no such column in data type tt
LINE 1: let vv.meh = 1;

 fixed

postgres=#  create table tt(id integer); create variable vv tt;
CREATE TABLE
CREATE VARIABLE
postgres=# let vv.meh = 1;
ERROR:  cannot assign to field "meh" of column or variable "vv" because there is no such column in data type tt
LINE 1: let vv.meh = 1;
            ^

Regards

Pavel

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

rebase after 02b8048ba5dc36238f3e7c3c58c5946220298d71

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Sun, Jan 30, 2022 at 08:09:18PM +0100, Pavel Stehule wrote:
> 
> rebase after 02b8048ba5dc36238f3e7c3c58c5946220298d71

Here are a few comments, mostly about pg_variable.c and sessionvariable.c.  I
stopped before reading the whole patch as I have some concern about the sinval
machanism, which ould change a bit the rest of the patch.  I'm also attaching a
patch (with .txt extension to avoid problem with the cfbot) with some comment
update propositions.

In sessionvariable.c, why VariableEOXAction and VariableEOXActionCodes?  Can't
the parser emit directly the char value, like e.g. relpersistence?

extraneous returns for 2 functions:

+void
+get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod,
+                                       Oid *collid)
+{
[...]
+   return;
+}

+void
+initVariable(Variable *var, Oid varid, bool fast_only)
+{
[...]
+   return;
+}

VariableCreate():

Maybe add a bunch of AssertArg() for all the mandatory parametrers?

Also, the check for variable already existing should be right after the
AssertArg(), and using SearchSysCacheExistsX().

Maybe also adding an Assert(OidIsValid(xxxoid)) just after the
CatalogTupleInsert(), similarly to some other creation functions?


event-triggers.sgml needs updating for the firing matrix, as session variable
are compatible with even triggers.

+typedef enum SVariableXActAction
+{
+   ON_COMMIT_DROP,     /* used for ON COMMIT DROP */
+   ON_COMMIT_RESET,    /* used for DROP VARIABLE */
+   RESET,              /* used for ON TRANSACTION END RESET */
+   RECHECK             /* recheck if session variable is living */
+} SVariableXActAction;

The names seem a bit generic, maybe add a prefix like SVAR_xxx?

ON_COMMIT_RESET is also confusing as it looks like an SQL clause.  Maybe
PERFORM_DROP or something?

+static List *xact_drop_actions = NIL;
+static List *xact_reset_actions = NIL;

Maybe add a comment saying both are lists of SVariableXActAction?

+typedef SVariableData * SVariable;

looks like a missing bump to typedefs.list.

+char *
+get_session_variable_name(Oid varid)
+{
+   HeapTuple   tup;
+   Form_pg_variable varform;
+   char       *varname;
+
+   tup = SearchSysCache1(VARIABLEOID, ObjectIdGetDatum(varid));
+
+   if (!HeapTupleIsValid(tup))
+       elog(ERROR, "cache lookup failed for session variable %u", varid);
+
+   varform = (Form_pg_variable) GETSTRUCT(tup);
+
+   varname = NameStr(varform->varname);
+
+   ReleaseSysCache(tup);
+
+   return varname;
+}

This kind of function should return a palloc'd copy of the name.

+void
+ResetSessionVariables(void)
[...]
+   list_free_deep(xact_drop_actions);
+   xact_drop_actions = NIL;
+
+   list_free_deep(xact_reset_actions);
+   xact_drop_actions = NIL;
+}

The 2nd chunk should be xact_reset_actions = NIL

+static void register_session_variable_xact_action(Oid varid, SVariableXActAction action);
+static void delete_session_variable_xact_action(Oid varid, SVariableXActAction action);

The naming is a bit confusing, maybe unregister_session_cable_xact_action() for
consistency?

+void
+register_session_variable_xact_action(Oid varid,
+                                     SVariableXActAction action)

the function is missing the static keyword.

In AtPreEOXact_SessionVariable_on_xact_actions(), those 2 instructions are
executed twice (once in the middle and once at the end):

    list_free_deep(xact_drop_actions);
    xact_drop_actions = NIL;



+    * If this entry was created during the current transaction,
+    * creating_subid is the ID of the creating subxact; if created in a prior
+    * transaction, creating_subid is zero.

I don't see any place in the code where creating_subid can be zero? It looks
like it's only there for future transactional implementation, but for now this
attribute seems unnecessary?


        /* at transaction end recheck sinvalidated variables */
        RegisterXactCallback(sync_sessionvars_xact_callback, NULL);

I don't think it's ok to use xact callback for in-core code.  The function
explicitly says:

> * These functions are intended for use by dynamically loaded modules.
> * For built-in modules we generally just hardwire the appropriate calls
> * (mainly because it's easier to control the order that way, where needed).

Also, this function and AtPreEOXact_SessionVariable_on_xact_actions() are
skipping all or part of the processing if there is no active transaction.  Is
that really ok?

I'm particularly sceptical about AtPreEOXact_SessionVariable_on_xact_actions
and the RECHECK actions, as the xact_reset_actions list is reset whether the
recheck was done or not, so it seems to me that it could be leaking some
entries in the hash table.  If the database has a lot of object, it seems
possible (while unlikely) that a subsequent CREATE VARIABLE can get the same
oid leading to incorrect results?

If that's somehow ok, wouldn't it be better to rearrange the code to call those
functions less often, and only when they can do their work, or at least split
the recheck in some different function / list?

+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
[...]
+   if (hashvalue != 0)
+   {
[...]
+   }
+   else
+       sync_sessionvars_all = true;

The rechecks being somewhat expensive, I think it could be a win to remove all
pending rechecks when setting the sync_sessionvars_all.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 2. 2. 2022 v 15:09 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Sun, Jan 30, 2022 at 08:09:18PM +0100, Pavel Stehule wrote:
>
> rebase after 02b8048ba5dc36238f3e7c3c58c5946220298d71

Here are a few comments, mostly about pg_variable.c and sessionvariable.c.  I
stopped before reading the whole patch as I have some concern about the sinval
machanism, which ould change a bit the rest of the patch.  I'm also attaching a
patch (with .txt extension to avoid problem with the cfbot) with some comment
update propositions.

merged, thank you
 

In sessionvariable.c, why VariableEOXAction and VariableEOXActionCodes?  Can't
the parser emit directly the char value, like e.g. relpersistence?


good idea, it reduces some not too useful code.

removed

 
extraneous returns for 2 functions:

+void
+get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod,
+                                       Oid *collid)
+{
[...]
+   return;
+}

+void
+initVariable(Variable *var, Oid varid, bool fast_only)
+{
[...]
+   return;
+}

removed, fixed


VariableCreate():

Maybe add a bunch of AssertArg() for all the mandatory parametrers?


done

 
Also, the check for variable already existing should be right after the
AssertArg(), and using SearchSysCacheExistsX().

Maybe also adding an Assert(OidIsValid(xxxoid)) just after the
CatalogTupleInsert(), similarly to some other creation functions?



done
 
event-triggers.sgml needs updating for the firing matrix, as session variable
are compatible with even triggers.

done
 

+typedef enum SVariableXActAction
+{
+   ON_COMMIT_DROP,     /* used for ON COMMIT DROP */
+   ON_COMMIT_RESET,    /* used for DROP VARIABLE */
+   RESET,              /* used for ON TRANSACTION END RESET */
+   RECHECK             /* recheck if session variable is living */
+} SVariableXActAction;

The names seem a bit generic, maybe add a prefix like SVAR_xxx?

done
 

ON_COMMIT_RESET is also confusing as it looks like an SQL clause.  Maybe
PERFORM_DROP or something?


In this case, I think so the name of this variable is accurate.

see comment

<-->/*
<--> * and if this transaction or subtransaction will be committed,
<--> * we want to enforce variable cleaning. (we don't need to wait for
<--> * sinval message). The cleaning action for one session variable
<--> * can be repeated in the action list, and it doesn't do any problem
<--> * (so we don't need to ensure uniqueness). We need separate action
<--> * than RESET, because RESET is executed on any transaction end,
<--> * but we want to execute cleaning only when thecurrent transaction
<--> * will be committed.
<--> */
<-->register_session_variable_xact_action(varid, SVAR_ON_COMMIT_RESET);

 
+static List *xact_drop_actions = NIL;
+static List *xact_reset_actions = NIL;

Maybe add a comment saying both are lists of SVariableXActAction?

done
 

+typedef SVariableData * SVariable;

looks like a missing bump to typedefs.list.

done

+char *
+get_session_variable_name(Oid varid)
+{
+   HeapTuple   tup;
+   Form_pg_variable varform;
+   char       *varname;
+
+   tup = SearchSysCache1(VARIABLEOID, ObjectIdGetDatum(varid));
+
+   if (!HeapTupleIsValid(tup))
+       elog(ERROR, "cache lookup failed for session variable %u", varid);
+
+   varform = (Form_pg_variable) GETSTRUCT(tup);
+
+   varname = NameStr(varform->varname);
+
+   ReleaseSysCache(tup);
+
+   return varname;
+}

This kind of function should return a palloc'd copy of the name.

fixed 


+void
+ResetSessionVariables(void)
[...]
+   list_free_deep(xact_drop_actions);
+   xact_drop_actions = NIL;
+
+   list_free_deep(xact_reset_actions);
+   xact_drop_actions = NIL;
+}

The 2nd chunk should be xact_reset_actions = NIL

fixed
 

+static void register_session_variable_xact_action(Oid varid, SVariableXActAction action);
+static void delete_session_variable_xact_action(Oid varid, SVariableXActAction action);

The naming is a bit confusing, maybe unregister_session_cable_xact_action() for
consistency?

changed
 

+void
+register_session_variable_xact_action(Oid varid,
+                                     SVariableXActAction action)

the function is missing the static keyword.

fixed
 

In AtPreEOXact_SessionVariable_on_xact_actions(), those 2 instructions are
executed twice (once in the middle and once at the end):

        list_free_deep(xact_drop_actions);
        xact_drop_actions = NIL;


fixed
 


+    * If this entry was created during the current transaction,
+    * creating_subid is the ID of the creating subxact; if created in a prior
+    * transaction, creating_subid is zero.

I don't see any place in the code where creating_subid can be zero? It looks
like it's only there for future transactional implementation, but for now this
attribute seems unnecessary?

The comment is not 100% valid. I removed the sentence about zero value of creating_subid.

I think so this attribute is necessary for correct behave, because these related actions lists should be always correct - you should not to drop variables 2x

and there are possible things like

begin;
create variable xx as int on transaction end reset;
let xx =100;
select xx;
savepoint s1;
drop variable xx;
rollback to s1;
rollback;

In the first version I had simplified code, and I remember, there was a problem when variables were modified in subtransaction or dropped, then I got messages related to missing objects. Implemented code is based on an already used pattern in Postgres.


                /* at transaction end recheck sinvalidated variables */
                RegisterXactCallback(sync_sessionvars_xact_callback, NULL);

I don't think it's ok to use xact callback for in-core code.  The function
explicitly says:

> * These functions are intended for use by dynamically loaded modules.
> * For built-in modules we generally just hardwire the appropriate calls
> * (mainly because it's easier to control the order that way, where needed).

It was a serious issue - after checking, I removed all related code. The sinval handler is called without hash only after ANALYZE command. In this case, we don't need to run any action.


Also, this function and AtPreEOXact_SessionVariable_on_xact_actions() are
skipping all or part of the processing if there is no active transaction.  Is
that really ok?

This part was +/- ok, although I can use just isCommit, but there was a bug. I cannot clean xact_reset_actions every time. It can be done just when isCommit. I fixed this issue
Fixed memory leaks there.
 

I'm particularly sceptical about AtPreEOXact_SessionVariable_on_xact_actions
and the RECHECK actions, as the xact_reset_actions list is reset whether the
recheck was done or not, so it seems to me that it could be leaking some
entries in the hash table.  If the database has a lot of object, it seems
possible (while unlikely) that a subsequent CREATE VARIABLE can get the same
oid leading to incorrect results?


it was buggy, I fixed it
 
If that's somehow ok, wouldn't it be better to rearrange the code to call those
functions less often, and only when they can do their work, or at least split
the recheck in some different function / list?

+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
[...]
+   if (hashvalue != 0)
+   {
[...]
+   }
+   else
+       sync_sessionvars_all = true;

The rechecks being somewhat expensive, I think it could be a win to remove all
pending rechecks when setting the sync_sessionvars_all.

I removed it

I am sending an updated and rebased patch.

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi


+    * If this entry was created during the current transaction,
+    * creating_subid is the ID of the creating subxact; if created in a prior
+    * transaction, creating_subid is zero.

I don't see any place in the code where creating_subid can be zero? It looks
like it's only there for future transactional implementation, but for now this
attribute seems unnecessary?

The comment is not 100% valid. I removed the sentence about zero value of creating_subid.

I lost commit with this change. I am sending updated patch.

Regards

Pavel


Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
> 
> I lost commit with this change. I am sending updated patch.

Thanks a lot Pavel!

I did a more thorough review of the patch.  I'm attaching a diff (in .txt
extension) for comment improvement suggestions.  I may have misunderstood
things so feel free to discard some of it.  I will mention the comment I didn't
understand in this mail.

First, I spotted some problem in the invalidation logic.

+ * Assign sinval mark to session variable. This mark probably
+ * signalized, so the session variable was dropped. But this
+ * should be rechecked later against system catalog.
+ */
+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)

You mention that hashvalue can only be zero for commands that can't
affect session variables (like VACUUM or ANALYZE), but that's not true.  It can
also happen in case of sinval queue overflow (see InvalidateSystemCaches()).
So in that case we should trigger a full recheck, with some heuristics on how
to detect that a cached variable is still valid.  Unfortunately the oid can
wraparound so some other check is needed to make it safe.

Also, even if we get a non-zero hashvalue in the inval callback, we can't
assume that there weren't any collision in the hash.  So the additional check
should be used there too.

We had a long off-line discussion about this with Pavel yesterday on what
heuristic to use there.  Unlike other caches where discarding an entry when it
shouldn't have been is not really problematic, the cache here contains the real
variable value so we can't discard it unless the variable was really dropped.
It should be possible to make it work, so I will let Pavel comment on which
approach he wants to use and what the drawbacks are.  I guess that this will be
the most critical part of this patch to decide whether the approach is
acceptable or not.


The rest is only minor stylistic comments.

Using -DRAW_EXPRESSION_COVERAGE_TEST I see that T_LetStmt is missing in
raw_expression_tree_walker.

ALTER and DROP both suggest "IMMUTABLE VARIABLE" as valid completion, while
it should only be usable in the CREATE [ IMMUTABLE ] VARIABLE form.

+initVariable(Variable *var, Oid varid, bool fast_only)
+{
+   var->collation = varform->varcollation;
+   var->eoxaction = varform->vareoxaction;
+   var->is_not_null = varform->varisnotnull;
+   var->is_immutable = varform->varisimmutable;

nit: eoxaction is defined after is_not_null and is_immutable, it would be
better to keep the initialization order consistent (same in VariableCreate).

+   values[Anum_pg_variable_varcollation - 1] = ObjectIdGetDatum((char) varCollation);
+   values[Anum_pg_variable_vareoxaction - 1] = CharGetDatum(eoxaction);

seems like the char cast is on the wrong variable?

+ * [...] We have to hold two separate action lists:
+ * one for dropping the session variable from system catalog, and
+ * another one for resetting its value. Both are necessary, since
+ * dropping a session variable also needs to enforce a reset of
+ * the value.

I don't fully understand that comment.  Maybe you meant that the opposite isn't
true, ie. highlight that a reset should *not* drop the variable thus two lists?

+typedef enum SVariableXActAction
+{
+   SVAR_ON_COMMIT_DROP,        /* used for ON COMMIT DROP */
+   SVAR_ON_COMMIT_RESET,       /* used for DROP VARIABLE */
+   SVAR_RESET,                 /* used for ON TRANSACTION END RESET */
+   SVAR_RECHECK                /* verify if session variable still exists */
+} SVariableXActAction;
+
+typedef struct SVariableXActActionItem
+{
+   Oid         varid;          /* varid of session variable */
+   SVariableXActAction action; /* reset or drop */

the stored action isn't simply "reset or drop", even though the resulting
action will be a reset or a drop (or a no-op) right?  Since it's storing a enum
define just before, I'd just drop the comment on action, and maybe specify that
SVAR_RECHECK will do appropriate cleanup if the session variable doesn't exist.


+ * Release the variable defined by varid from sessionvars
+ * hashtab.
+ */
+static void
+free_session_variable(SVariable svar)

The function name is a bit confusing given the previous function.  Maybe this
one should be called forget_session_variable() instead, or something like that?

I think the function comment should also mention that caller is responsible for
making sure that the sessionvars htab exists before calling it, for extra
clarity, or just add an assert for that.

+static void
+free_session_variable_varid(Oid varid)

Similary, maybe renaming this function forget_session_variable_by_id()?

+static void
+create_sessionvars_hashtable(void)
+{
+   HASHCTL     ctl;
+
+   /* set callbacks */
+   if (first_time)
+   {
+       /* Read sinval messages */
+       CacheRegisterSyscacheCallback(VARIABLEOID,
+                                     pg_variable_cache_callback,
+                                     (Datum) 0);
+
+       first_time = false;
+   }
+
+   /* needs its own long lived memory context */
+   if (SVariableMemoryContext == NULL)
+   {
+       SVariableMemoryContext =
+           AllocSetContextCreate(TopMemoryContext,
+                                 "session variables",
+                                 ALLOCSET_START_SMALL_SIZES);
+   }

As far as I can see the SVariableMemoryContext can be reset but never set to
NULL, so I think the initialization can be done in the first_time case, and
otherwise asserted that it's not NULL.

+   if (!isnull && svar->typid != typid)
+       ereport(ERROR,
+               (errcode(ERRCODE_DATATYPE_MISMATCH),
+                errmsg("type \"%s\" of assigned value is different than type \"%s\" of session variable \"%s.%

Why testing isnull?  I don't think it's ok to allow NULL::text in an int
variable for instance.  This isn't valid in other context (like inserting in a
table)

+    * result of default expression always). Don't do this check, when variable
+    * is initialized.
+    */
+   if (!init_mode &&

I think the last part of the comment is a bit misleading.  Maybe "when variable
is being initialized" (and similary same for the function comment).

+ * We try not to break the previous value, if something is wrong.
+ *
+ * As side efect this function acquires AccessShareLock on
+ * related session variable until commit.
+ */
+void
+SetSessionVariable(Oid varid, Datum value, bool isNull, Oid typid)

I don't understand what you mean by "We try not to break the previous value, if
something is wrong".

+   /* Initialize svar when not initialized or when stored value is null */
+   if (!found)
+   {
+       Variable    var;
+
+       /* don't need defexpr and acl here */
+       initVariable(&var, varid, true);
+       init_session_variable(svar, &var);
+   }
+
+   set_session_variable(svar, value, isNull, typid, false);

Shouldn't the comment be on the set_session_variable() vall rather than on the
!found block?

+ * Returns the value of the session variable specified by varid. Check correct
+ * result type. Optionally the result can be copied.
+ */
+Datum
+GetSessionVariable(Oid varid, bool *isNull, Oid expected_typid, bool copy)

All callers use copy == true, couldn't we get rid of it and say it returns a
copy of the value if any?

+ * Create new ON_COMMIT_DROP xact action. We have to drop
+ * ON COMMIT DROP variable, although this variable should not
+ * be used. So we need to register this action in CREATE VARIABLE
+ * time.

I don't understand this comment.

+AtPreEOXact_SessionVariable_on_xact_actions(bool isCommit)
+{
+   ListCell   *l;
+
+   foreach(l, xact_drop_actions)
+   {
+       SVariableXActActionItem *xact_ai =
+                           (SVariableXActActionItem *) lfirst(l);
+
+       /* Iterate only over non dropped entries */
+       if (xact_ai->deleting_subid == InvalidSubTransactionId)
+       {
+           Assert(xact_ai->action == SVAR_ON_COMMIT_DROP);

The assert sould probably be in the block above.

+            * We want to reset session variable (release it from
+            * local memory) when RESET is required or when session
+            * variable was removed explicitly (DROP VARIABLE) or
+            * implicitly (ON COMMIT DROP). Explicit releasing should
+            * be done only if the transaction is commited.
+            */
+           if ((xact_ai->action == SVAR_RESET) ||
+               (xact_ai->action == SVAR_ON_COMMIT_RESET &&
+                xact_ai->deleting_subid == InvalidSubTransactionId &&
+                isCommit))
+               free_session_variable_varid(xact_ai->varid);

This chunk is a bit hard to follow.  Also, for SVAR_RESET wouldn't it be better
to only make the svar invalid and keep it in the htab?  If so, this could be
split in two different branches which would be easier to follow.

+       if (!isCommit &&
+           xact_ai->creating_subid == mySubid &&
+           xact_ai->action != SVAR_RESET &&
+           xact_ai->action != SVAR_RECHECK)
+       {
+           /* cur_item must be removed */
+           xact_reset_actions = foreach_delete_current(xact_reset_actions, cur_item);
+           pfree(xact_ai);

I think that be definition only the SVAR_ON_COMMIT_DROP (cleaning entry for a
dropped session variable) will ever need to be removed there, so we should
check for that instead of not being something else?


+   /*
+    * Prepare session variables, if not prepared in queryDesc
+    */
+   if (queryDesc->num_session_variables > 0)

I don't understand that comment.

+static void
+svariableStartupReceiver(DestReceiver *self, int operation, TupleDesc typeinfo)
+{
+   svariableState *myState = (svariableState *) self;
+   int         natts = typeinfo->natts;
+   int         outcols = 0;
+   int         i;
+
+   for (i = 0; i < natts; i++)
+   {
+       Form_pg_attribute attr = TupleDescAttr(typeinfo, i);
+
+       if (attr->attisdropped)
+           continue;
+
+       if (++outcols > 1)
+           elog(ERROR, "svariable DestReceiver can take only one attribute");
+
+       myState->typid = attr->atttypid;
+       myState->typmod = attr->atttypmod;
+       myState->typlen = attr->attlen;
+       myState->slot_offset = i;
+   }
+
+   myState->rows = 0;
+}

Maybe add an initial Assert to make sure that caller did call
SetVariableDestReceiverParams(), and final check that one attribute was found?

@@ -1794,15 +1840,39 @@ fix_expr_common(PlannerInfo *root, Node *node)
                g->cols = cols;
        }
    }
+   else if (IsA(node, Param))
+   {
+       Param *p = (Param *) node;
+
+       if (p->paramkind == PARAM_VARIABLE)
+       {
+           PlanInvalItem *inval_item = makeNode(PlanInvalItem);
+
+           /* paramid is still session variable id */
+           inval_item->cacheId = VARIABLEOID;
+           inval_item->hashValue = GetSysCacheHashValue1(VARIABLEOID,
+                                                         ObjectIdGetDatum(p->paramvarid));
+
+           /* Append this variable to global, register dependency */
+           root->glob->invalItems = lappend(root->glob->invalItems,
+                                            inval_item);
+       }
+   }

I didn't see any test covering invalidation of cached plan using session
variables.  Could you add some?  While at it, maybe use different values on the
sesssion_variable.sql tests rather than 100 in many places, so it's easier to
identifier which case broke in case of problem.

+static Node *
+makeParamSessionVariable(ParseState *pstate,
+                       Oid varid, Oid typid, int32 typmod, Oid collid,
+                       char *attrname, int location)
+{
[...]
+   /*
+    * There are two ways to access session variables - direct, used by simple
+    * plpgsql expressions, where it is not necessary to emulate stability.
+    * And Buffered access, which is used everywhere else. We should ensure
+    * stable values, and because session variables are global, then we should
+    * work with copied values instead of directly accessing variables. For
+    * direct access, the varid is best. For buffered access, we need
+    * to assign an index to the buffer - later, when we know what variables are
+    * used. Now, we just remember, so we use session variables.

I don't understand the last part, starting with "For buffered access, we
need...".  Also, the beginning of the comment seems like something more general
and may be moved somewhere, maybe at the beginning of sessionvariable.c?

+    * stmt->query is SelectStmt node. An tranformation of
+    * this node doesn't support SetToDefault node. Instead injecting
+    * of transformSelectStmt or parse state, we can directly
+    * transform target list here if holds SetToDefault node.
+    */
+   if (stmt->set_default)

I don't understand this comment.  Especially since the next
transformTargetList() will emit SetToDefault node that will be handled later in
that function and then in RewriteQuery.

+   /*
+    * rewrite SetToDefaults needs varid in Query structure
+    */
+   query->resultVariable = varid;

I also don't understand that comment.  Is is always set just in case there's a
SetToDefault, or something else?

+   /* translate paramvarid to session variable name */
+   if (param->paramkind == PARAM_VARIABLE)
+   {
+       appendStringInfo(context->buf, "%s",
+                        generate_session_variable_name(param->paramvarid));
+       return;
+   }

A bit more work seems to be needed for deparsing session variables:

# create variable myvar text;
CREATE VARIABLE

# create view myview as select myvar;
CREATE VIEW

# \d+ myview
                          View "public.myview"
 Column | Type | Collation | Nullable | Default | Storage  | Description
--------+------+-----------+----------+---------+----------+-------------
 myvar  | text |           |          |         | extended |
View definition:
 SELECT myvar AS myvar;

There shouldn't be an explicit alias I think.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Thu, Mar 03, 2022 at 03:06:52PM +0800, Julien Rouhaud wrote:
> Hi,
>
> On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
> >
> > I lost commit with this change. I am sending updated patch.

Also, another thing is the size of the patch.  It's probably the minimum to
have a consistent working implementation, but maybe we can still split it to
make review easier?

For instance, maybe having:

- the pg_variable part on its own, without a way to use them, maybe with
  syscache helpers
- the main session variable implementation and test coverage
- plpgsql support and test coverage
- pg_dump support and test coverage

It wouldn't make the main patch that small but could still help quite a bit.

Any better suggestion?



Re: Schema variables - new implementation for Postgres 15

From
Justin Pryzby
Date:
On Thu, Mar 03, 2022 at 03:06:52PM +0800, Julien Rouhaud wrote:
> Hi,
> 
> On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
> > 
> > I lost commit with this change. I am sending updated patch.
> 
> Thanks a lot Pavel!
> 
> I did a more thorough review of the patch.  I'm attaching a diff (in .txt
> extension) for comment improvement suggestions.  I may have misunderstood

But the attachment actually was a *.patch, so cfbot tried and failed to apply
it.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Sat, Mar 19, 2022 at 04:46:13PM -0500, Justin Pryzby wrote:
> On Thu, Mar 03, 2022 at 03:06:52PM +0800, Julien Rouhaud wrote:
> > Hi,
> > 
> > On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
> > > 
> > > I lost commit with this change. I am sending updated patch.
> > 
> > Thanks a lot Pavel!
> > 
> > I did a more thorough review of the patch.  I'm attaching a diff (in .txt
> > extension) for comment improvement suggestions.  I may have misunderstood
> 
> But the attachment actually was a *.patch, so cfbot tried and failed to apply
> it.

Argh, I indeed failed to rename the patch.  Thanks!



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

Hi


A bit more work seems to be needed for deparsing session variables:

# create variable myvar text;
CREATE VARIABLE

# create view myview as select myvar;
CREATE VIEW

# \d+ myview
                          View "public.myview"
 Column | Type | Collation | Nullable | Default | Storage  | Description
--------+------+-----------+----------+---------+----------+-------------
 myvar  | text |           |          |         | extended |
View definition:
 SELECT myvar AS myvar;

There shouldn't be an explicit alias I think.

I check this issue, and I afraid so it is not fixable. The target list entry related to session variable has not some magic value like ?column? that can be used for check if tle->resname is implicit or explicit

And in this time I cannot to use FigureColname because it doesn't work with transformed nodes. More - the Param node can be nested in SubscriptingRef or FieldSelect. It doesn't work perfectly now. See following example:

create type xt as (a int, b int);
create view b as select (10, ((random()*100)::int)::xt).b;
\d+ b
SELECT (ROW(10, (random() * 100::double precision)::integer)::xt).b AS b;

Regards

Pavel



 

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Wed, Mar 23, 2022 at 09:58:59PM +0100, Pavel Stehule wrote:
> 
> A bit more work seems to be needed for deparsing session variables:
> >
> > # create variable myvar text;
> > CREATE VARIABLE
> >
> > # create view myview as select myvar;
> > CREATE VIEW
> >
> > # \d+ myview
> >                           View "public.myview"
> >  Column | Type | Collation | Nullable | Default | Storage  | Description
> > --------+------+-----------+----------+---------+----------+-------------
> >  myvar  | text |           |          |         | extended |
> > View definition:
> >  SELECT myvar AS myvar;
> >
> > There shouldn't be an explicit alias I think.
> >
> 
> I check this issue, and I afraid so it is not fixable. The target list
> entry related to session variable has not some magic value like ?column?
> that can be used for check if tle->resname is implicit or explicit
> 
> And in this time I cannot to use FigureColname because it doesn't work with
> transformed nodes. More - the Param node can be nested in SubscriptingRef
> or FieldSelect. It doesn't work perfectly now. See following example:
> 
> create type xt as (a int, b int);
> create view b as select (10, ((random()*100)::int)::xt).b;
> \d+ b
> SELECT (ROW(10, (random() * 100::double precision)::integer)::xt).b AS b;

Fair enough.  Since there is other code that already behaves the same I agree
that it's better to not add special cases in ruleutils.c and have an explicit
alias in the deparsed view, which isn't incorrect.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

čt 3. 3. 2022 v 8:16 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Thu, Mar 03, 2022 at 03:06:52PM +0800, Julien Rouhaud wrote:
> Hi,
>
> On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
> >
> > I lost commit with this change. I am sending updated patch.

Also, another thing is the size of the patch.  It's probably the minimum to
have a consistent working implementation, but maybe we can still split it to
make review easier?

For instance, maybe having:

- the pg_variable part on its own, without a way to use them, maybe with
  syscache helpers
- the main session variable implementation and test coverage
- plpgsql support and test coverage
- pg_dump support and test coverage

It wouldn't make the main patch that small but could still help quite a bit.

Any better suggestion?

I am sending fresh rebased patch + separation to more patches. This split is initial, and can be changed later

Regards

Pavel



Attachment

Re: Schema variables - new implementation for Postgres 15+1

From
Justin Pryzby
Date:
On Sun, Apr 10, 2022 at 08:30:39PM +0200, Pavel Stehule wrote:
> I am sending fresh rebased patch + separation to more patches. This split
> is initial, and can be changed later

The 0001 patch requires this, but it's not included until 0003.
src/include/commands/session_variable.h

Each patch should compile and pass tests with the preceding patches, without
the following patches.  I think the regression tests should be included with
their corresponding patch.  Maybe it's ok to separate out the changes for
pg_dump, docs, and psql - but they'd have to be merged together eventually.
I realize some of this runs counter to Julien's suggestion to split patches.

The version should be changed:
+       if (fout->remoteVersion < 150000)

I enabled these, which causes the regression tests fail:

+#define COPY_PARSE_PLAN_TREES
+#define WRITE_READ_PARSE_PLAN_TREES
+#define RAW_EXPRESSION_COVERAGE_TEST

/home/pryzbyj/src/postgres/src/test/regress/results/session_variables.out   2022-04-10 15:37:32.926306124 -0500
@@ -16,7 +16,7 @@
 SET ROLE TO var_test_role;
 -- should fail
 LET var1 = 10;
-ERROR:  permission denied for session variable var1
+ERROR:  unrecognized node type: 368
...



Re: Schema variables - new implementation for Postgres 15+1

From
Julien Rouhaud
Date:
Hi,

On Sun, Apr 10, 2022 at 03:43:33PM -0500, Justin Pryzby wrote:
> On Sun, Apr 10, 2022 at 08:30:39PM +0200, Pavel Stehule wrote:
> > I am sending fresh rebased patch + separation to more patches. This split
> > is initial, and can be changed later
> 
> The 0001 patch requires this, but it's not included until 0003.
> src/include/commands/session_variable.h
> 
> Each patch should compile and pass tests with the preceding patches, without
> the following patches.  I think the regression tests should be included with
> their corresponding patch.  Maybe it's ok to separate out the changes for
> pg_dump, docs, and psql - but they'd have to be merged together eventually.
> I realize some of this runs counter to Julien's suggestion to split patches.

Note that most of my suggestions were only to make the patch easier to review,
which was mostly trying to limit a bit the core of the new code.

Unfortunately, given the feature we can't really split the patch in many and
smaller parts and expect them to be completely self contained, so I'm not
against splitting smaller chunks like psql support and whatnot.  But I'm not
convinced that it will make it easier to review.



Re: Schema variables - new implementation for Postgres 15+1

From
Pavel Stehule
Date:


ne 10. 4. 2022 v 22:43 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
On Sun, Apr 10, 2022 at 08:30:39PM +0200, Pavel Stehule wrote:
> I am sending fresh rebased patch + separation to more patches. This split
> is initial, and can be changed later

The 0001 patch requires this, but it's not included until 0003.
src/include/commands/session_variable.h

Each patch should compile and pass tests with the preceding patches, without
the following patches.  I think the regression tests should be included with
their corresponding patch.  Maybe it's ok to separate out the changes for
pg_dump, docs, and psql - but they'd have to be merged together eventually.
I realize some of this runs counter to Julien's suggestion to split patches.

fixed
 

The version should be changed:
+       if (fout->remoteVersion < 150000)

currently, there is not branch for PostgreSQL 16, but I'll fix it, when new devel branch will be created
 

I enabled these, which causes the regression tests fail:

+#define COPY_PARSE_PLAN_TREES
+#define WRITE_READ_PARSE_PLAN_TREES
+#define RAW_EXPRESSION_COVERAGE_TEST

/home/pryzbyj/src/postgres/src/test/regress/results/session_variables.out   2022-04-10 15:37:32.926306124 -0500
@@ -16,7 +16,7 @@
 SET ROLE TO var_test_role;
 -- should fail
 LET var1 = 10;
-ERROR:  permission denied for session variable var1
+ERROR:  unrecognized node type: 368
...

fixed

I can divide regress tests, but in reality, this is just one feature, and it is hard to separate. Regress tests need the first 4 patches to be possible to test something useful.

Regards

Pavel


 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 3. 3. 2022 v 8:06 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
>
> I lost commit with this change. I am sending updated patch.

Thanks a lot Pavel!

I did a more thorough review of the patch.  I'm attaching a diff (in .txt
extension) for comment improvement suggestions.  I may have misunderstood
things so feel free to discard some of it.  I will mention the comment I didn't
understand in this mail.

First, I spotted some problem in the invalidation logic.

+ * Assign sinval mark to session variable. This mark probably
+ * signalized, so the session variable was dropped. But this
+ * should be rechecked later against system catalog.
+ */
+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)

You mention that hashvalue can only be zero for commands that can't
affect session variables (like VACUUM or ANALYZE), but that's not true.  It can
also happen in case of sinval queue overflow (see InvalidateSystemCaches()).
So in that case we should trigger a full recheck, with some heuristics on how
to detect that a cached variable is still valid.  Unfortunately the oid can
wraparound so some other check is needed to make it safe.

Also, even if we get a non-zero hashvalue in the inval callback, we can't
assume that there weren't any collision in the hash.  So the additional check
should be used there too.

We had a long off-line discussion about this with Pavel yesterday on what
heuristic to use there.  Unlike other caches where discarding an entry when it
shouldn't have been is not really problematic, the cache here contains the real
variable value so we can't discard it unless the variable was really dropped.
It should be possible to make it work, so I will let Pavel comment on which
approach he wants to use and what the drawbacks are.  I guess that this will be
the most critical part of this patch to decide whether the approach is
acceptable or not.

I thought more about this issue, and I think it is solvable, although differently (little bit than we talked about). The check based on oid and xmin should not be enough for consistency check, because xmin can be quickly lost when a user executes VACUUM FREEZE or VACUUM FULL.

The consistency of a stored session variable should be checked always when the session variable is used (for reading) the first time in a transaction.  When value is created and used in the same transaction, then the consistency check is not necessary. When consistency check fails, then stored value is marked as broken and cannot be read. Can be overwritten.

We can believe that session variables based on buildin types are always consistent.

Composite types should be checked recursively from top to buildin types. It means we should hold tupledescs for all nested composites. Initially the check can be very strict.

Last case is consistency check for types owned by some extensions. For this case we can accept the version number of related extensions. Without change we can believe so the stored binary data are consistent.



The rest is only minor stylistic comments.

Using -DRAW_EXPRESSION_COVERAGE_TEST I see that T_LetStmt is missing in
raw_expression_tree_walker.

fixed
 

ALTER and DROP both suggest "IMMUTABLE VARIABLE" as valid completion, while
it should only be usable in the CREATE [ IMMUTABLE ] VARIABLE form.

fixed
 

+initVariable(Variable *var, Oid varid, bool fast_only)
+{
+   var->collation = varform->varcollation;
+   var->eoxaction = varform->vareoxaction;
+   var->is_not_null = varform->varisnotnull;
+   var->is_immutable = varform->varisimmutable;

nit: eoxaction is defined after is_not_null and is_immutable, it would be
better to keep the initialization order consistent (same in VariableCreate).

fixed
 

+   values[Anum_pg_variable_varcollation - 1] = ObjectIdGetDatum((char) varCollation);
+   values[Anum_pg_variable_vareoxaction - 1] = CharGetDatum(eoxaction);

seems like the char cast is on the wrong variable?

fixed
 

+ * [...] We have to hold two separate action lists:
+ * one for dropping the session variable from system catalog, and
+ * another one for resetting its value. Both are necessary, since
+ * dropping a session variable also needs to enforce a reset of
+ * the value.

I don't fully understand that comment.  Maybe you meant that the opposite isn't
true, ie. highlight that a reset should *not* drop the variable thus two lists?

I tried to describe the issue in the comment. When I have just one action list, then I had a problem with impossibility to extend this list about reset action enforced by drop variable when I iterated over this list in xact time. This issue was solved by using two lists - one for drop and second for reset and recheck.
 

+typedef enum SVariableXActAction
+{
+   SVAR_ON_COMMIT_DROP,        /* used for ON COMMIT DROP */
+   SVAR_ON_COMMIT_RESET,       /* used for DROP VARIABLE */
+   SVAR_RESET,                 /* used for ON TRANSACTION END RESET */
+   SVAR_RECHECK                /* verify if session variable still exists */
+} SVariableXActAction;
+
+typedef struct SVariableXActActionItem
+{
+   Oid         varid;          /* varid of session variable */
+   SVariableXActAction action; /* reset or drop */

the stored action isn't simply "reset or drop", even though the resulting
action will be a reset or a drop (or a no-op) right?  Since it's storing a enum
define just before, I'd just drop the comment on action, and maybe specify that
SVAR_RECHECK will do appropriate cleanup if the session variable doesn't exist.

done
 

+ * Release the variable defined by varid from sessionvars
+ * hashtab.
+ */
+static void
+free_session_variable(SVariable svar)

The function name is a bit confusing given the previous function.  Maybe this
one should be called forget_session_variable() instead, or something like that?

I think the function comment should also mention that caller is responsible for
making sure that the sessionvars htab exists before calling it, for extra
clarity, or just add an assert for that.

+static void
+free_session_variable_varid(Oid varid)

Similary, maybe renaming this function forget_session_variable_by_id()?

I don't like "forget" too much - maybe "remove" can be used instead - like HASH_REMOVE


+static void
+create_sessionvars_hashtable(void)
+{
+   HASHCTL     ctl;
+
+   /* set callbacks */
+   if (first_time)
+   {
+       /* Read sinval messages */
+       CacheRegisterSyscacheCallback(VARIABLEOID,
+                                     pg_variable_cache_callback,
+                                     (Datum) 0);
+
+       first_time = false;
+   }
+
+   /* needs its own long lived memory context */
+   if (SVariableMemoryContext == NULL)
+   {
+       SVariableMemoryContext =
+           AllocSetContextCreate(TopMemoryContext,
+                                 "session variables",
+                                 ALLOCSET_START_SMALL_SIZES);
+   }

As far as I can see the SVariableMemoryContext can be reset but never set to
NULL, so I think the initialization can be done in the first_time case, and
otherwise asserted that it's not NULL.

done
 

+   if (!isnull && svar->typid != typid)
+       ereport(ERROR,
+               (errcode(ERRCODE_DATATYPE_MISMATCH),
+                errmsg("type \"%s\" of assigned value is different than type \"%s\" of session variable \"%s.%

Why testing isnull?  I don't think it's ok to allow NULL::text in an int
variable for instance.  This isn't valid in other context (like inserting in a
table)

changed
 

+    * result of default expression always). Don't do this check, when variable
+    * is initialized.
+    */
+   if (!init_mode &&

I think the last part of the comment is a bit misleading.  Maybe "when variable
is being initialized" (and similary same for the function comment).

changed
 

+ * We try not to break the previous value, if something is wrong.
+ *
+ * As side efect this function acquires AccessShareLock on
+ * related session variable until commit.
+ */
+void
+SetSessionVariable(Oid varid, Datum value, bool isNull, Oid typid)

I don't understand what you mean by "We try not to break the previous value, if
something is wrong".

That means, so SetSessionVariable sets a new value or should preserve the original value.
 

+   /* Initialize svar when not initialized or when stored value is null */
+   if (!found)
+   {
+       Variable    var;
+
+       /* don't need defexpr and acl here */
+       initVariable(&var, varid, true);
+       init_session_variable(svar, &var);
+   }
+
+   set_session_variable(svar, value, isNull, typid, false);

Shouldn't the comment be on the set_session_variable() vall rather than on the
!found block?

This comment is obsolete,

removed
 

+ * Returns the value of the session variable specified by varid. Check correct
+ * result type. Optionally the result can be copied.
+ */
+Datum
+GetSessionVariable(Oid varid, bool *isNull, Oid expected_typid, bool copy)

All callers use copy == true, couldn't we get rid of it and say it returns a
copy of the value if any?

I replaced it with the new function CopySessionVariableWithTypeCheck. Probably in almost all situations, the copy will be required. And if not, we can enhance the API later.


+ * Create new ON_COMMIT_DROP xact action. We have to drop
+ * ON COMMIT DROP variable, although this variable should not
+ * be used. So we need to register this action in CREATE VARIABLE
+ * time.

I don't understand this comment.

changed
 

+AtPreEOXact_SessionVariable_on_xact_actions(bool isCommit)
+{
+   ListCell   *l;
+
+   foreach(l, xact_drop_actions)
+   {
+       SVariableXActActionItem *xact_ai =
+                           (SVariableXActActionItem *) lfirst(l);
+
+       /* Iterate only over non dropped entries */
+       if (xact_ai->deleting_subid == InvalidSubTransactionId)
+       {
+           Assert(xact_ai->action == SVAR_ON_COMMIT_DROP);

The assert sould probably be in the block above.

moved
 

+            * We want to reset session variable (release it from
+            * local memory) when RESET is required or when session
+            * variable was removed explicitly (DROP VARIABLE) or
+            * implicitly (ON COMMIT DROP). Explicit releasing should
+            * be done only if the transaction is commited.
+            */
+           if ((xact_ai->action == SVAR_RESET) ||
+               (xact_ai->action == SVAR_ON_COMMIT_RESET &&
+                xact_ai->deleting_subid == InvalidSubTransactionId &&
+                isCommit))
+               free_session_variable_varid(xact_ai->varid);

This chunk is a bit hard to follow.  Also, for SVAR_RESET wouldn't it be better
to only make the svar invalid and keep it in the htab?  If so, this could be
split in two different branches which would be easier to follow.

After some experiments, I think it is more simple to remove the svar entry in htab. It reduces the state space, and variable initialization once per transaction is not expensive. The problem is in necessary xact action registration and now I can call it simply just from init_session_variable. I updated  comments there

 

+       if (!isCommit &&
+           xact_ai->creating_subid == mySubid &&
+           xact_ai->action != SVAR_RESET &&
+           xact_ai->action != SVAR_RECHECK)
+       {
+           /* cur_item must be removed */
+           xact_reset_actions = foreach_delete_current(xact_reset_actions, cur_item);
+           pfree(xact_ai);

I think that be definition only the SVAR_ON_COMMIT_DROP (cleaning entry for a
dropped session variable) will ever need to be removed there, so we should
check for that instead of not being something else?


fixed
 

+   /*
+    * Prepare session variables, if not prepared in queryDesc
+    */
+   if (queryDesc->num_session_variables > 0)
 
I don't understand that comment.

I changed this comment

 

+static void
+svariableStartupReceiver(DestReceiver *self, int operation, TupleDesc typeinfo)
+{
+   svariableState *myState = (svariableState *) self;
+   int         natts = typeinfo->natts;
+   int         outcols = 0;
+   int         i;
+
+   for (i = 0; i < natts; i++)
+   {
+       Form_pg_attribute attr = TupleDescAttr(typeinfo, i);
+
+       if (attr->attisdropped)
+           continue;
+
+       if (++outcols > 1)
+           elog(ERROR, "svariable DestReceiver can take only one attribute");
+
+       myState->typid = attr->atttypid;
+       myState->typmod = attr->atttypmod;
+       myState->typlen = attr->attlen;
+       myState->slot_offset = i;
+   }
+
+   myState->rows = 0;
+}

Maybe add an initial Assert to make sure that caller did call
SetVariableDestReceiverParams(), and final check that one attribute was found?

done
 

@@ -1794,15 +1840,39 @@ fix_expr_common(PlannerInfo *root, Node *node)
                g->cols = cols;
        }
    }
+   else if (IsA(node, Param))
+   {
+       Param *p = (Param *) node;
+
+       if (p->paramkind == PARAM_VARIABLE)
+       {
+           PlanInvalItem *inval_item = makeNode(PlanInvalItem);
+
+           /* paramid is still session variable id */
+           inval_item->cacheId = VARIABLEOID;
+           inval_item->hashValue = GetSysCacheHashValue1(VARIABLEOID,
+                                                         ObjectIdGetDatum(p->paramvarid));
+
+           /* Append this variable to global, register dependency */
+           root->glob->invalItems = lappend(root->glob->invalItems,
+                                            inval_item);
+       }
+   }

I didn't see any test covering invalidation of cached plan using session
variables.  Could you add some?  While at it, maybe use different values on the
sesssion_variable.sql tests rather than 100 in many places, so it's easier to
identifier which case broke in case of problem.

I created new tests there

+static Node *
+makeParamSessionVariable(ParseState *pstate,
+                       Oid varid, Oid typid, int32 typmod, Oid collid,
+                       char *attrname, int location)
+{
[...]
+   /*
+    * There are two ways to access session variables - direct, used by simple
+    * plpgsql expressions, where it is not necessary to emulate stability.
+    * And Buffered access, which is used everywhere else. We should ensure
+    * stable values, and because session variables are global, then we should
+    * work with copied values instead of directly accessing variables. For
+    * direct access, the varid is best. For buffered access, we need
+    * to assign an index to the buffer - later, when we know what variables are
+    * used. Now, we just remember, so we use session variables.

I don't understand the last part, starting with "For buffered access, we
need...".  Also, the beginning of the comment seems like something more general
and may be moved somewhere, maybe at the beginning of sessionvariable.c?

moved to sessionvariable.c and modified.


+    * stmt->query is SelectStmt node. An tranformation of
+    * this node doesn't support SetToDefault node. Instead injecting
+    * of transformSelectStmt or parse state, we can directly
+    * transform target list here if holds SetToDefault node.
+    */
+   if (stmt->set_default)

I don't understand this comment.  Especially since the next
transformTargetList() will emit SetToDefault node that will be handled later in
that function and then in RewriteQuery.

This is messy, sorry. SelectStmt doesn't support SetToDefault. LetStmt supports it. I reworded.


+   /*
+    * rewrite SetToDefaults needs varid in Query structure
+    */
+   query->resultVariable = varid;

I also don't understand that comment.  Is is always set just in case there's a
SetToDefault, or something else?

This comment is not complete. This value is required by QueryRewriter (for replacement of the SetToDefault node by defexpr). It is required for acquiring locks, and for execution.

I rewrote this comment

 

+   /* translate paramvarid to session variable name */
+   if (param->paramkind == PARAM_VARIABLE)
+   {
+       appendStringInfo(context->buf, "%s",
+                        generate_session_variable_name(param->paramvarid));
+       return;
+   }

A bit more work seems to be needed for deparsing session variables:

# create variable myvar text;
CREATE VARIABLE

# create view myview as select myvar;
CREATE VIEW

# \d+ myview
                          View "public.myview"
 Column | Type | Collation | Nullable | Default | Storage  | Description
--------+------+-----------+----------+---------+----------+-------------
 myvar  | text |           |          |         | extended |
View definition:
 SELECT myvar AS myvar;

There shouldn't be an explicit alias I think.

this issue was described in other thread

I am sending rebased, updated patches. The type check is not implemented yet.

Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi Pavel,

On Tue, Jul 05, 2022 at 08:42:09AM +0200, Pavel Stehule wrote:
> Hi
>
> fresh rebase + type check. Before returning any value, the related type is
> checked if it is valid still

Great news, thanks a lot for keeping working on it!  I'm still in PTO since
last Friday, but I'm planning to start reviewing this patch as soon as I come
back.  It might take a while as my knowledge of this patch are a bit blurry but
hopefully it shouldn't take too long.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 5. 7. 2022 v 12:50 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi Pavel,

On Tue, Jul 05, 2022 at 08:42:09AM +0200, Pavel Stehule wrote:
> Hi
>
> fresh rebase + type check. Before returning any value, the related type is
> checked if it is valid still

Great news, thanks a lot for keeping working on it!  I'm still in PTO since
last Friday, but I'm planning to start reviewing this patch as soon as I come
back.  It might take a while as my knowledge of this patch are a bit blurry but
hopefully it shouldn't take too long.

Thank you

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Justin Pryzby
Date:
On Wed, Jul 06, 2022 at 10:30:31PM +0200, Pavel Stehule wrote:
> This set of patches should to help me with investigation of regress test
> fail reported by cfbot

Do you know you can do the same as what cfbot does under your own github
account ?  Please see: src/tools/ci/README

-- 
Justin



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 7. 7. 2022 v 8:43 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
On Wed, Jul 06, 2022 at 10:30:31PM +0200, Pavel Stehule wrote:
> This set of patches should to help me with investigation of regress test
> fail reported by cfbot

Do you know you can do the same as what cfbot does under your own github
account ?  Please see: src/tools/ci/README

I didn't know it. I am sorry for sending garbage to the mailing list.

Thank you for information

Pavel

 

--
Justin

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
On 7/21/22 08:16, Pavel Stehule wrote:
> Hi
> 
> new update of session variable;s implementation
> 
> - fresh rebase
> - new possibility to trace execution with DEBUG1 notification
> - new SRF function pg_debug_show_used_session_variables that returns 
> content of sessionvars hashtab
> - redesign of work with list of variables for reset, recheck, on commit 
> drop, on commit reset

Hi Pavel,

I don't know exactly what failed but the docs (html/pdf) don't build:


cd ~/pg_stuff/pg_sandbox/pgsql.schema_variables/doc/src/sgml

$ make html
/usr/bin/xmllint --path . --noout --valid postgres.sgml
postgres.sgml:374: element link: validity error : IDREF attribute 
linkend references an unknown ID "catalog-pg-variable"
make: *** [Makefile:135: html-stamp] Error 4



Erik Rijkers


> 
> Regards
> 
> Pavel
> 



Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Thu, Jul 21, 2022 at 09:09:47AM +0200, Erik Rijkers wrote:
> On 7/21/22 08:16, Pavel Stehule wrote:
> > Hi
> > 
> > new update of session variable;s implementation
> > 
> > - fresh rebase
> > - new possibility to trace execution with DEBUG1 notification
> > - new SRF function pg_debug_show_used_session_variables that returns
> > content of sessionvars hashtab
> > - redesign of work with list of variables for reset, recheck, on commit
> > drop, on commit reset

Thanks for working on those!  I will keep reviewing the patchset.

> I don't know exactly what failed but the docs (html/pdf) don't build:
> 
> cd ~/pg_stuff/pg_sandbox/pgsql.schema_variables/doc/src/sgml
> 
> $ make html
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> postgres.sgml:374: element link: validity error : IDREF attribute linkend
> references an unknown ID "catalog-pg-variable"
> make: *** [Makefile:135: html-stamp] Error 4

Apparently most of the changes in catalogs.sgml didn't survive the last rebase.
I do see the needed section in v20220709-0012-documentation.patch:

> + <sect1 id="catalog-pg-variable">
> +  <title><structname>pg_variable</structname></title>
> [...]



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 21. 7. 2022 v 9:34 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Thu, Jul 21, 2022 at 09:09:47AM +0200, Erik Rijkers wrote:
> On 7/21/22 08:16, Pavel Stehule wrote:
> > Hi
> >
> > new update of session variable;s implementation
> >
> > - fresh rebase
> > - new possibility to trace execution with DEBUG1 notification
> > - new SRF function pg_debug_show_used_session_variables that returns
> > content of sessionvars hashtab
> > - redesign of work with list of variables for reset, recheck, on commit
> > drop, on commit reset

Thanks for working on those!  I will keep reviewing the patchset.

> I don't know exactly what failed but the docs (html/pdf) don't build:
>
> cd ~/pg_stuff/pg_sandbox/pgsql.schema_variables/doc/src/sgml
>
> $ make html
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> postgres.sgml:374: element link: validity error : IDREF attribute linkend
> references an unknown ID "catalog-pg-variable"
> make: *** [Makefile:135: html-stamp] Error 4

Apparently most of the changes in catalogs.sgml didn't survive the last rebase.
I do see the needed section in v20220709-0012-documentation.patch:

> + <sect1 id="catalog-pg-variable">
> +  <title><structname>pg_variable</structname></title>
> [...]

should be fixed now

Thank you for  check

Regards

Pavel

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Fri, Jul 22, 2022 at 10:58:25AM +0200, Pavel Stehule wrote:
> > Apparently most of the changes in catalogs.sgml didn't survive the last
> > rebase.
> > I do see the needed section in v20220709-0012-documentation.patch:
> >
> > > + <sect1 id="catalog-pg-variable">
> > > +  <title><structname>pg_variable</structname></title>
> > > [...]
> >
>
> should be fixed now

Thanks!  I confirm that the documentation compiles now.

As mentioned off-list, I still think that the main comment in sessionvariable.c
needs to be adapted to the new approach.  At the very least it still refers to
the previous 2 lists, but as far as I can see there are now 4 lists:

+ /* Both lists hold fields of SVariableXActActionItem type */
+ static List *xact_on_commit_drop_actions = NIL;
+ static List *xact_on_commit_reset_actions = NIL;
+
+ /*
+  * the ON COMMIT DROP and ON TRANSACTION END RESET variables
+  * are purged from memory every time.
+  */
+ static List *xact_reset_varids = NIL;
+
+ /*
+  * Holds list variable's id that that should be
+  * checked against system catalog if still live.
+  */
+ static List *xact_recheck_varids = NIL;

Apart from that, I'm not sure how much of the previous behavior changed.

It would be easier to review the new patchset having some up to date general
description of the approach.  If that's overall the same, just implemented
slightly differently I will just go ahead and dig into the patchset (although
the comments will still have to be changed eventually).

Also, one of the things that changes since the last version is:

@@ -1980,15 +1975,13 @@ AtEOSubXact_SessionVariable_on_xact_actions(bool isCommit, SubTransactionId mySu
     */
    foreach(cur_item, xact_on_commit_reset_actions)
    {
        SVariableXActActionItem *xact_ai =
                                  (SVariableXActActionItem *) lfirst(cur_item);

-       if (!isCommit &&
-           xact_ai->creating_subid == mySubid &&
-           xact_ai->action == SVAR_ON_COMMIT_DROP)
+       if (!isCommit && xact_ai->creating_subid == mySubid)

We previously discussed this off-line, but for some quick context the test was
buggy as it wasn't possible to have an SVAR_ON_COMMIT_DROP action in the
xact_on_commit_reset_actions list.  However I don't see any change in the
regression tests since the last version and the tests are all green in both
versions.

It means that was fixed but there's no test covering it.  The local memory
management is probably the hardest part of this patchset, so I'm a bit worried
if there's nothing that can catch a bug leading to leaked values or entries in
some processing list.  Do you think it's possible to add some test that would
have caught the previous bug?



Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
On 7/22/22 10:58, Pavel Stehule wrote:
> 
> čt 21. 7. 2022 v 9:34 odesílatel Julien Rouhaud <rjuju123@gmail.com 
> <mailto:rjuju123@gmail.com>> napsal:
>  
 > [v20220722] patches

Hi Pavel,

Thanks, docs now build.

Attached a few small text-changes.

Also, the pg_restore-doc still has the old name  'schema_variable' 
instead of session_variable:

-A schema_variable
--variable=schema_variable

Surely those should be changed as well.

Erik Rijkers
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Justin Pryzby
Date:
Hi Erik,

On Sun, Jul 24, 2022 at 03:39:32PM +0200, Erik Rijkers wrote:
> Attached a few small text-changes.

When you send patches like this, could you rename them to something other than
*.patch and *.diff ?

Otherwise, cfbot tries to apply *only* your patches to master, which fails due
to missing the original patches that your changes are on top of, and makes it
look like the author's patch needs to be rebased.
http://cfbot.cputube.org/pavel-stehule.html - Apply patches: FAILED

Alternately, (especially if your patch fixes a bug), you can resend the
author's patches, rebased, as [1.patch, ..., N.patch] plus your changes as
N+1.patch.  Then, cfbot tests your patches, and the author can easily review
and then integrate your changes.  (This is especially nice if the patches
currently need to be rebased, and you can make the cfbot pass at the same time
as sending fixes).

Cheers,

-- 
Justin



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 24. 7. 2022 v 13:12 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Fri, Jul 22, 2022 at 10:58:25AM +0200, Pavel Stehule wrote:
> > Apparently most of the changes in catalogs.sgml didn't survive the last
> > rebase.
> > I do see the needed section in v20220709-0012-documentation.patch:
> >
> > > + <sect1 id="catalog-pg-variable">
> > > +  <title><structname>pg_variable</structname></title>
> > > [...]
> >
>
> should be fixed now

Thanks!  I confirm that the documentation compiles now.

As mentioned off-list, I still think that the main comment in sessionvariable.c
needs to be adapted to the new approach.  At the very least it still refers to
the previous 2 lists, but as far as I can see there are now 4 lists:

+ /* Both lists hold fields of SVariableXActActionItem type */
+ static List *xact_on_commit_drop_actions = NIL;
+ static List *xact_on_commit_reset_actions = NIL;
+
+ /*
+  * the ON COMMIT DROP and ON TRANSACTION END RESET variables
+  * are purged from memory every time.
+  */
+ static List *xact_reset_varids = NIL;
+
+ /*
+  * Holds list variable's id that that should be
+  * checked against system catalog if still live.
+  */
+ static List *xact_recheck_varids = NIL;

Apart from that, I'm not sure how much of the previous behavior changed.

It would be easier to review the new patchset having some up to date general
description of the approach.  If that's overall the same, just implemented
slightly differently I will just go ahead and dig into the patchset (although
the comments will still have to be changed eventually).

Also, one of the things that changes since the last version is:

@@ -1980,15 +1975,13 @@ AtEOSubXact_SessionVariable_on_xact_actions(bool isCommit, SubTransactionId mySu
     */
    foreach(cur_item, xact_on_commit_reset_actions)
    {
        SVariableXActActionItem *xact_ai =
                                  (SVariableXActActionItem *) lfirst(cur_item);

-       if (!isCommit &&
-           xact_ai->creating_subid == mySubid &&
-           xact_ai->action == SVAR_ON_COMMIT_DROP)
+       if (!isCommit && xact_ai->creating_subid == mySubid)

We previously discussed this off-line, but for some quick context the test was
buggy as it wasn't possible to have an SVAR_ON_COMMIT_DROP action in the
xact_on_commit_reset_actions list.  However I don't see any change in the
regression tests since the last version and the tests are all green in both
versions.

It means that was fixed but there's no test covering it.  The local memory
management is probably the hardest part of this patchset, so I'm a bit worried
if there's nothing that can catch a bug leading to leaked values or entries in
some processing list.  Do you think it's possible to add some test that would
have caught the previous bug?

I am sending an updated patch. I had to modify sinval message handling. Previous implementation was not robust and correct (there was some possibility, so value stored in session's variable was lost after aborted drop variable. There are new regress tests requested by Julien and some others describing the mentioned issue. I rewrote the implementation's description part in sessionvariable.c.

Erik's patches are merged. Thank you for them.

Regards

Pavel


Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Wed, Jul 27, 2022 at 09:59:18PM +0200, Pavel Stehule wrote:
> 
> ne 24. 7. 2022 v 13:12 odesílatel Julien Rouhaud <rjuju123@gmail.com>
> napsal:
> 
> > Hi,
> >
> > On Fri, Jul 22, 2022 at 10:58:25AM +0200, Pavel Stehule wrote:
> > > > Apparently most of the changes in catalogs.sgml didn't survive the last
> > > > rebase.
> > > > I do see the needed section in v20220709-0012-documentation.patch:
> > > >
> > > > > + <sect1 id="catalog-pg-variable">
> > > > > +  <title><structname>pg_variable</structname></title>
> > > > > [...]
> > > >
> > >
> > > should be fixed now
> >
> > Thanks!  I confirm that the documentation compiles now.
> >
> > As mentioned off-list, I still think that the main comment in
> > sessionvariable.c
> > needs to be adapted to the new approach.  At the very least it still
> > refers to
> > the previous 2 lists, but as far as I can see there are now 4 lists:
> >
> > + /* Both lists hold fields of SVariableXActActionItem type */
> > + static List *xact_on_commit_drop_actions = NIL;
> > + static List *xact_on_commit_reset_actions = NIL;
> > +
> > + /*
> > +  * the ON COMMIT DROP and ON TRANSACTION END RESET variables
> > +  * are purged from memory every time.
> > +  */
> > + static List *xact_reset_varids = NIL;
> > +
> > + /*
> > +  * Holds list variable's id that that should be
> > +  * checked against system catalog if still live.
> > +  */
> > + static List *xact_recheck_varids = NIL;
> >
> > Apart from that, I'm not sure how much of the previous behavior changed.
> >
> > It would be easier to review the new patchset having some up to date
> > general
> > description of the approach.  If that's overall the same, just implemented
> > slightly differently I will just go ahead and dig into the patchset
> > (although
> > the comments will still have to be changed eventually).
> >
> > Also, one of the things that changes since the last version is:
> >
> > @@ -1980,15 +1975,13 @@ AtEOSubXact_SessionVariable_on_xact_actions(bool
> > isCommit, SubTransactionId mySu
> >      */
> >     foreach(cur_item, xact_on_commit_reset_actions)
> >     {
> >         SVariableXActActionItem *xact_ai =
> >                                   (SVariableXActActionItem *)
> > lfirst(cur_item);
> >
> > -       if (!isCommit &&
> > -           xact_ai->creating_subid == mySubid &&
> > -           xact_ai->action == SVAR_ON_COMMIT_DROP)
> > +       if (!isCommit && xact_ai->creating_subid == mySubid)
> >
> > We previously discussed this off-line, but for some quick context the test
> > was
> > buggy as it wasn't possible to have an SVAR_ON_COMMIT_DROP action in the
> > xact_on_commit_reset_actions list.  However I don't see any change in the
> > regression tests since the last version and the tests are all green in both
> > versions.
> >
> > It means that was fixed but there's no test covering it.  The local memory
> > management is probably the hardest part of this patchset, so I'm a bit
> > worried
> > if there's nothing that can catch a bug leading to leaked values or
> > entries in
> > some processing list.  Do you think it's possible to add some test that
> > would
> > have caught the previous bug?
> >
> 
> I am sending an updated patch. I had to modify sinval message handling.
> Previous implementation was not robust and correct (there was some
> possibility, so value stored in session's variable was lost after aborted
> drop variable. There are new regress tests requested by Julien and some
> others describing the mentioned issue. I rewrote the implementation's
> description part in sessionvariable.c.

Thanks a lot, that's very helpful!

I looked at the new description and I'm not sure that I understand the need for
the "format change" code that tries to detect whether the underlying types was
modified.  It seems quite fragile, wouldn't it be better to have the same
behavior as for relation (detect and prevent such changes in the first place),
since both cases share the same requirements about underlying data types?  For
instance, it should be totally acceptable to drop an attribute from a custom
data type if a session variable is using it, same as if a table is using it but
as is it would be rejected for session variables.

While at it, the new comments contain a lot of non breakable spaces rather than
normal spaces.  I also just realized that there's a sessionvariable.c while the
header is named session_variable.h.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


po 1. 8. 2022 v 6:54 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Wed, Jul 27, 2022 at 09:59:18PM +0200, Pavel Stehule wrote:
>
> ne 24. 7. 2022 v 13:12 odesílatel Julien Rouhaud <rjuju123@gmail.com>
> napsal:
>
> > Hi,
> >
> > On Fri, Jul 22, 2022 at 10:58:25AM +0200, Pavel Stehule wrote:
> > > > Apparently most of the changes in catalogs.sgml didn't survive the last
> > > > rebase.
> > > > I do see the needed section in v20220709-0012-documentation.patch:
> > > >
> > > > > + <sect1 id="catalog-pg-variable">
> > > > > +  <title><structname>pg_variable</structname></title>
> > > > > [...]
> > > >
> > >
> > > should be fixed now
> >
> > Thanks!  I confirm that the documentation compiles now.
> >
> > As mentioned off-list, I still think that the main comment in
> > sessionvariable.c
> > needs to be adapted to the new approach.  At the very least it still
> > refers to
> > the previous 2 lists, but as far as I can see there are now 4 lists:
> >
> > + /* Both lists hold fields of SVariableXActActionItem type */
> > + static List *xact_on_commit_drop_actions = NIL;
> > + static List *xact_on_commit_reset_actions = NIL;
> > +
> > + /*
> > +  * the ON COMMIT DROP and ON TRANSACTION END RESET variables
> > +  * are purged from memory every time.
> > +  */
> > + static List *xact_reset_varids = NIL;
> > +
> > + /*
> > +  * Holds list variable's id that that should be
> > +  * checked against system catalog if still live.
> > +  */
> > + static List *xact_recheck_varids = NIL;
> >
> > Apart from that, I'm not sure how much of the previous behavior changed.
> >
> > It would be easier to review the new patchset having some up to date
> > general
> > description of the approach.  If that's overall the same, just implemented
> > slightly differently I will just go ahead and dig into the patchset
> > (although
> > the comments will still have to be changed eventually).
> >
> > Also, one of the things that changes since the last version is:
> >
> > @@ -1980,15 +1975,13 @@ AtEOSubXact_SessionVariable_on_xact_actions(bool
> > isCommit, SubTransactionId mySu
> >      */
> >     foreach(cur_item, xact_on_commit_reset_actions)
> >     {
> >         SVariableXActActionItem *xact_ai =
> >                                   (SVariableXActActionItem *)
> > lfirst(cur_item);
> >
> > -       if (!isCommit &&
> > -           xact_ai->creating_subid == mySubid &&
> > -           xact_ai->action == SVAR_ON_COMMIT_DROP)
> > +       if (!isCommit && xact_ai->creating_subid == mySubid)
> >
> > We previously discussed this off-line, but for some quick context the test
> > was
> > buggy as it wasn't possible to have an SVAR_ON_COMMIT_DROP action in the
> > xact_on_commit_reset_actions list.  However I don't see any change in the
> > regression tests since the last version and the tests are all green in both
> > versions.
> >
> > It means that was fixed but there's no test covering it.  The local memory
> > management is probably the hardest part of this patchset, so I'm a bit
> > worried
> > if there's nothing that can catch a bug leading to leaked values or
> > entries in
> > some processing list.  Do you think it's possible to add some test that
> > would
> > have caught the previous bug?
> >
>
> I am sending an updated patch. I had to modify sinval message handling.
> Previous implementation was not robust and correct (there was some
> possibility, so value stored in session's variable was lost after aborted
> drop variable. There are new regress tests requested by Julien and some
> others describing the mentioned issue. I rewrote the implementation's
> description part in sessionvariable.c.

Thanks a lot, that's very helpful!

I looked at the new description and I'm not sure that I understand the need for
the "format change" code that tries to detect whether the underlying types was
modified.  It seems quite fragile, wouldn't it be better to have the same
behavior as for relation (detect and prevent such changes in the first place),
since both cases share the same requirements about underlying data types?  For
instance, it should be totally acceptable to drop an attribute from a custom
data type if a session variable is using it, same as if a table is using it but
as is it would be rejected for session variables.

This is the first implementation and my strategy is "to be safe and to be strict". I did tests I know, so the test of compatibility of composite types can be more tolerant. But I use this test to test my identity against oid overflow, and I don't feel comfortable if I write this test too tolerantly. For implementation of a more precious test I need to save a signature of attributes. So the test should not be done just on
compatibility of types from TupleDesc, but it should to check attributes oid's. I had an idea to implement it in the next stage, and for this stage just to require compatibility of the vector of types.

Can this enhanced check be implemented later or do you think so it should be implemented now? I'll check how much new code it needs.
 

While at it, the new comments contain a lot of non breakable spaces rather than
normal spaces.  I also just realized that there's a sessionvariable.c while the
header is named session_variable.h.

My bad - I used gmail as a spellchecker, and it wrote some white spaces there :-/

should be fixed now



Attachment

Re: Schema variables - new implementation for Postgres 15

From
Alvaro Herrera
Date:
> diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
> index f6b740df0a..b3bee39457 100644
> --- a/src/backend/parser/parse_relation.c
> +++ b/src/backend/parser/parse_relation.c
> @@ -3667,8 +3667,8 @@ errorMissingColumn(ParseState *pstate,
>          ereport(ERROR,
>                  (errcode(ERRCODE_UNDEFINED_COLUMN),
>                   relname ?
> -                 errmsg("column %s.%s does not exist", relname, colname) :
> -                 errmsg("column \"%s\" does not exist", colname),
> +                 errmsg("column or variable %s.%s does not exist", relname, colname) :
> +                 errmsg("column or variable \"%s\" does not exist", colname),
>                   state->rfirst ? closestfirst ?
>                   errhint("Perhaps you meant to reference the column \"%s.%s\".",
>                           state->rfirst->eref->aliasname, closestfirst) :

This is in your patch 12.  I wonder -- if relname is not null, then
surely this is a column and not a variable, right?  So only the second
errmsg() here should be changed, and the first one should remain as in
the original.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
Op 19-08-2022 om 17:29 schreef Pavel Stehule:
> pá 19. 8. 2022 v 15:57 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
> 
>> Hi
>>
>> I am sending fresh update
>>
>> - enhanced work with composite types - now the used composite type can be
>> enhanced, reduced and stored value is converted to expected format
>> - enhancing find_composite_type_dependencies to support session variables,
>> so the type of any field of used composite type cannot be changed
>>
> 
> update - fix cpp check

v20220819-2-0001-Catalogue-support-for-session-variables.patch
v20220819-2-0002-session-variables.patch
v20220819-2-0003-typecheck-check-of-consistency-of-format-of-stored-v.patch
v20220819-2-0004-LET-command.patch
v20220819-2-0005-Support-of-LET-command-in-PLpgSQL.patch
v20220819-2-0006-DISCARD-VARIABLES-command.patch
v20220819-2-0007-Enhancing-psql-for-session-variables.patch
v20220819-2-0008-Possibility-to-dump-session-variables-by-pg_dump.patch
v20220819-2-0009-typedefs.patch
v20220819-2-0010-Regress-tests-for-session-variables.patch
v20220819-2-0011-fix.patch
v20220819-2-0012-This-patch-changes-error-message-column-doesn-t-exis.patch
v20220819-2-0013-documentation.patch

make check  fails as a result of the errors in the attached 
session_variables.out.


Erik

> 
>> Regards
>>
>> Pavel
>>
>
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
Op 20-08-2022 om 15:32 schreef Erik Rijkers:
> Op 19-08-2022 om 17:29 schreef Pavel Stehule:
> 
> make check  fails as a result of the errors in the attached 
> session_variables.out.
> 


Sorry, that should have been this diffs file, of course (attached).


Erik
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 20. 8. 2022 v 15:36 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
Op 20-08-2022 om 15:32 schreef Erik Rijkers:
> Op 19-08-2022 om 17:29 schreef Pavel Stehule:
>
> make check  fails as a result of the errors in the attached
> session_variables.out.
>


Sorry, that should have been this diffs file, of course (attached).

It looks like some problem with not well initialized memory, but I have no idea how it is possible. What are your configure options?




Erik

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:

Op 20-08-2022 om 15:41 schreef Pavel Stehule:
> so 20. 8. 2022 v 15:36 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
> 
>> Op 20-08-2022 om 15:32 schreef Erik Rijkers:
>>> Op 19-08-2022 om 17:29 schreef Pavel Stehule:
>>>
>>> make check  fails as a result of the errors in the attached
>>> session_variables.out.
>>>
>>
>>
>> Sorry, that should have been this diffs file, of course (attached).
>>
> 
> It looks like some problem with not well initialized memory, but I have no
> idea how it is possible. What are your configure options?
> 

I compiled both assert-enable and 'normal', and I only just noticed that 
the assert-enable one did pass tests normally.


Below is the config that produced the failing tests:

./configure 
--prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables 
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables/bin.fast 
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables/lib.fast 
--with-pgport=6986 --quiet --enable-depend --with-openssl --with-perl 
--with-libxml --with-libxslt --with-zlib  --enable-tap-tests 
--with-extra-version=_0820_schema_variables_1509    --with-lz4  --with-icu


(debian 9, gcc 12.2.0)

> 
>>
>> Erik
> 



Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Sat, Aug 20, 2022 at 03:55:07PM +0200, Erik Rijkers wrote:
>
> Op 20-08-2022 om 15:41 schreef Pavel Stehule:
> > so 20. 8. 2022 v 15:36 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
> >
> > > Op 20-08-2022 om 15:32 schreef Erik Rijkers:
> > > > Op 19-08-2022 om 17:29 schreef Pavel Stehule:
> > > >
> > > > make check  fails as a result of the errors in the attached
> > > > session_variables.out.
> > > >
> > >
> > >
> > > Sorry, that should have been this diffs file, of course (attached).
> > >
> >
> > It looks like some problem with not well initialized memory, but I have no
> > idea how it is possible. What are your configure options?
> >
>
> I compiled both assert-enable and 'normal', and I only just noticed that the
> assert-enable one did pass tests normally.
>
>
> Below is the config that produced the failing tests:
>
> ./configure
> --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables/bin.fast
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables/lib.fast
> --with-pgport=6986 --quiet --enable-depend --with-openssl --with-perl
> --with-libxml --with-libxslt --with-zlib  --enable-tap-tests
> --with-extra-version=_0820_schema_variables_1509    --with-lz4  --with-icu

I also tried locally (didn't look at the patch yet), with debug/assert enabled,
and had similar error:

diff -dU10 /Users/rjuju/git/postgresql/src/test/regress/expected/session_variables.out
/Users/rjuju/git/pg/pgmaster_debug/src/test/regress/results/session_variables.out
--- /Users/rjuju/git/postgresql/src/test/regress/expected/session_variables.out    2022-08-20 22:25:17.000000000 +0800
+++ /Users/rjuju/git/pg/pgmaster_debug/src/test/regress/results/session_variables.out    2022-08-20 22:30:50.000000000
+0800
@@ -983,23 +983,23 @@
 -- should to fail
 SELECT public.svar;
   svar
 ---------
  (10,20)
 (1 row)

 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 -- should to fail too (different type, different generation number);
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+        svar
+--------------------
+ (10,20,2139062142)
 (1 row)

 LET public.svar = ROW(10,20,30);
 -- should be ok again for new value
 SELECT public.svar;
     svar
 ------------
  (10,20,30)
 (1 row)

@@ -1104,31 +1104,31 @@
 (1 row)

 DROP VARIABLE public.svar;
 DROP TYPE public.svar_test_type;
 CREATE TYPE public.svar_test_type AS (a int, b int);
 CREATE VARIABLE public.svar AS public.svar_test_type;
 CREATE VARIABLE public.svar2 AS public.svar_test_type;
 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)

 SELECT public.svar2;
   svar2
 ---------
  (10,30)
 (1 row)



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

so 20. 8. 2022 v 16:35 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Sat, Aug 20, 2022 at 03:55:07PM +0200, Erik Rijkers wrote:
>
> Op 20-08-2022 om 15:41 schreef Pavel Stehule:
> > so 20. 8. 2022 v 15:36 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
> >
> > > Op 20-08-2022 om 15:32 schreef Erik Rijkers:
> > > > Op 19-08-2022 om 17:29 schreef Pavel Stehule:
> > > >
> > > > make check  fails as a result of the errors in the attached
> > > > session_variables.out.
> > > >
> > >
> > >
> > > Sorry, that should have been this diffs file, of course (attached).
> > >
> >
> > It looks like some problem with not well initialized memory, but I have no
> > idea how it is possible. What are your configure options?
> >
>
> I compiled both assert-enable and 'normal', and I only just noticed that the
> assert-enable one did pass tests normally.
>
>
> Below is the config that produced the failing tests:
>
> ./configure
> --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables --bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables/bin.fast --libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.schema_variables/lib.fast
> --with-pgport=6986 --quiet --enable-depend --with-openssl --with-perl
> --with-libxml --with-libxslt --with-zlib  --enable-tap-tests
> --with-extra-version=_0820_schema_variables_1509    --with-lz4  --with-icu

I also tried locally (didn't look at the patch yet), with debug/assert enabled,
and had similar error:

diff -dU10 /Users/rjuju/git/postgresql/src/test/regress/expected/session_variables.out /Users/rjuju/git/pg/pgmaster_debug/src/test/regress/results/session_variables.out
--- /Users/rjuju/git/postgresql/src/test/regress/expected/session_variables.out 2022-08-20 22:25:17.000000000 +0800
+++ /Users/rjuju/git/pg/pgmaster_debug/src/test/regress/results/session_variables.out   2022-08-20 22:30:50.000000000 +0800
@@ -983,23 +983,23 @@
 -- should to fail
 SELECT public.svar;
   svar
 ---------
  (10,20)
 (1 row)

 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 -- should to fail too (different type, different generation number);
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+        svar
+--------------------
+ (10,20,2139062142)
 (1 row)

 LET public.svar = ROW(10,20,30);
 -- should be ok again for new value
 SELECT public.svar;
     svar
 ------------
  (10,20,30)
 (1 row)

@@ -1104,31 +1104,31 @@
 (1 row)

 DROP VARIABLE public.svar;
 DROP TYPE public.svar_test_type;
 CREATE TYPE public.svar_test_type AS (a int, b int);
 CREATE VARIABLE public.svar AS public.svar_test_type;
 CREATE VARIABLE public.svar2 AS public.svar_test_type;
 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)

 SELECT public.svar2;
   svar2
 ---------
  (10,30)
 (1 row)

I hope so I found this error. It should be fixed

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 19. 8. 2022 v 22:54 odesílatel Alvaro Herrera <alvherre@alvh.no-ip.org> napsal:
> diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
> index f6b740df0a..b3bee39457 100644
> --- a/src/backend/parser/parse_relation.c
> +++ b/src/backend/parser/parse_relation.c
> @@ -3667,8 +3667,8 @@ errorMissingColumn(ParseState *pstate,
>               ereport(ERROR,
>                               (errcode(ERRCODE_UNDEFINED_COLUMN),
>                                relname ?
> -                              errmsg("column %s.%s does not exist", relname, colname) :
> -                              errmsg("column \"%s\" does not exist", colname),
> +                              errmsg("column or variable %s.%s does not exist", relname, colname) :
> +                              errmsg("column or variable \"%s\" does not exist", colname),
>                                state->rfirst ? closestfirst ?
>                                errhint("Perhaps you meant to reference the column \"%s.%s\".",
>                                                state->rfirst->eref->aliasname, closestfirst) :

This is in your patch 12.  I wonder -- if relname is not null, then
surely this is a column and not a variable, right?  So only the second
errmsg() here should be changed, and the first one should remain as in
the original.

Yes, it should work. I changed it in today patch

Thank you for tip

Regards

Pavel
 

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
Op 20-08-2022 om 20:09 schreef Pavel Stehule:
> Hi
> 
>>   LET public.svar2 = (10, 20, 30);
>>   ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
>>   SELECT public.svar;
>> - svar
>> --------
>> - (10,)
>> +  svar
>> +---------
>> + (10,16)
>>   (1 row)
>>
>>   SELECT public.svar2;
>>     svar2
>>   ---------
>>    (10,30)
>>   (1 row)
>>
> 
> I hope so I found this error. It should be fixed
>  > [patches v20220820-1-0001 -> 0012]


I'm afraid it still gives the same errors during  'make check', and 
again only errors when compiling  without  --enable-cassert

Thanks,

Erik


> Regards
> 
> Pavel
>
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Sat, Aug 20, 2022 at 08:44:49PM +0200, Erik Rijkers wrote:
> Op 20-08-2022 om 20:09 schreef Pavel Stehule:
> > Hi
> > 
> > >   LET public.svar2 = (10, 20, 30);
> > >   ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
> > >   SELECT public.svar;
> > > - svar
> > > --------
> > > - (10,)
> > > +  svar
> > > +---------
> > > + (10,16)
> > >   (1 row)
> > > 
> > >   SELECT public.svar2;
> > >     svar2
> > >   ---------
> > >    (10,30)
> > >   (1 row)
> > > 
> > 
> > I hope so I found this error. It should be fixed
> >  > [patches v20220820-1-0001 -> 0012]
> 
> 
> I'm afraid it still gives the same errors during  'make check', and again
> only errors when compiling  without  --enable-cassert

It still fails for me for both --enable-cassert and --disable-cassert, with a
different number of errors though.

The cfbot is green, but it's unclear to me which version was applied on the
last run.  AFAICS there's no log available for the branch creation if it
succeeds.

--enable-cassert:

 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)



--disable-cassert:

 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 -- should to fail too (different type, different generation number);
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,32)
 (1 row)

 LET public.svar = ROW(10,20,30);
 -- should be ok again for new value
 SELECT public.svar;
     svar
 ------------
  (10,20,30)
 (1 row)

@@ -1104,31 +1104,31 @@
 (1 row)

 DROP VARIABLE public.svar;
 DROP TYPE public.svar_test_type;
 CREATE TYPE public.svar_test_type AS (a int, b int);
 CREATE VARIABLE public.svar AS public.svar_test_type;
 CREATE VARIABLE public.svar2 AS public.svar_test_type;
 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)




Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 21. 8. 2022 v 6:36 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Sat, Aug 20, 2022 at 08:44:49PM +0200, Erik Rijkers wrote:
> Op 20-08-2022 om 20:09 schreef Pavel Stehule:
> > Hi
> >
> > >   LET public.svar2 = (10, 20, 30);
> > >   ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
> > >   SELECT public.svar;
> > > - svar
> > > --------
> > > - (10,)
> > > +  svar
> > > +---------
> > > + (10,16)
> > >   (1 row)
> > >
> > >   SELECT public.svar2;
> > >     svar2
> > >   ---------
> > >    (10,30)
> > >   (1 row)
> > >
> >
> > I hope so I found this error. It should be fixed
> >  > [patches v20220820-1-0001 -> 0012]
>
>
> I'm afraid it still gives the same errors during  'make check', and again
> only errors when compiling  without  --enable-cassert

It still fails for me for both --enable-cassert and --disable-cassert, with a
different number of errors though.

The cfbot is green, but it's unclear to me which version was applied on the
last run.  AFAICS there's no log available for the branch creation if it
succeeds.

--enable-cassert:

 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)



--disable-cassert:

 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 -- should to fail too (different type, different generation number);
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,32)
 (1 row)

 LET public.svar = ROW(10,20,30);
 -- should be ok again for new value
 SELECT public.svar;
     svar
 ------------
  (10,20,30)
 (1 row)

@@ -1104,31 +1104,31 @@
 (1 row)

 DROP VARIABLE public.svar;
 DROP TYPE public.svar_test_type;
 CREATE TYPE public.svar_test_type AS (a int, b int);
 CREATE VARIABLE public.svar AS public.svar_test_type;
 CREATE VARIABLE public.svar2 AS public.svar_test_type;
 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)

I got the same result, when I did build without assertions, so I can debug it now.

 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 21. 8. 2022 v 6:36 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Sat, Aug 20, 2022 at 08:44:49PM +0200, Erik Rijkers wrote:
> Op 20-08-2022 om 20:09 schreef Pavel Stehule:
> > Hi
> >
> > >   LET public.svar2 = (10, 20, 30);
> > >   ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
> > >   SELECT public.svar;
> > > - svar
> > > --------
> > > - (10,)
> > > +  svar
> > > +---------
> > > + (10,16)
> > >   (1 row)
> > >
> > >   SELECT public.svar2;
> > >     svar2
> > >   ---------
> > >    (10,30)
> > >   (1 row)
> > >
> >
> > I hope so I found this error. It should be fixed
> >  > [patches v20220820-1-0001 -> 0012]
>
>
> I'm afraid it still gives the same errors during  'make check', and again
> only errors when compiling  without  --enable-cassert

It still fails for me for both --enable-cassert and --disable-cassert, with a
different number of errors though.

The cfbot is green, but it's unclear to me which version was applied on the
last run.  AFAICS there's no log available for the branch creation if it
succeeds.

--enable-cassert:

 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)



--disable-cassert:

 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 -- should to fail too (different type, different generation number);
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,32)
 (1 row)

 LET public.svar = ROW(10,20,30);
 -- should be ok again for new value
 SELECT public.svar;
     svar
 ------------
  (10,20,30)
 (1 row)

@@ -1104,31 +1104,31 @@
 (1 row)

 DROP VARIABLE public.svar;
 DROP TYPE public.svar_test_type;
 CREATE TYPE public.svar_test_type AS (a int, b int);
 CREATE VARIABLE public.svar AS public.svar_test_type;
 CREATE VARIABLE public.svar2 AS public.svar_test_type;
 LET public.svar = (10, 20);
 ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
 SELECT public.svar;
-   svar
-----------
- (10,20,)
+    svar
+------------
+ (10,20,16)
 (1 row)

 LET public.svar2 = (10, 20, 30);
 ALTER TYPE public.svar_test_type DROP ATTRIBUTE b;
 SELECT public.svar;
- svar
--------
- (10,)
+  svar
+---------
+ (10,16)
 (1 row)

should be fixed now

Thank you for check

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
Op 21-08-2022 om 09:54 schreef Pavel Stehule:
> ne 21. 8. 2022 v 6:36 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> 
>> On Sat, Aug 20, 2022 at 08:44:49PM +0200, Erik Rijkers wrote:
>>> Op 20-08-2022 om 20:09 schreef Pavel Stehule:

> 
> should be fixed now> 


Yep, all tests OK now.
Thanks!

Erik




Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi Pavel,

On Sun, Aug 21, 2022 at 09:54:03AM +0200, Pavel Stehule wrote:
> 
> should be fixed now

I started reviewing the patchset, beginning with 0001 (at least the parts that
don't substantially change later) and have a few comments.

- you define new AclMode READ and WRITE.  Those bits are precious and I don't
  think it's ok to consume 2 bits for session variables, especially since those
  are the last two bits available since the recent GUC access control patch
  (ACL_SET and ACL_ALTER_SYSTEM).  Maybe we could existing INSERT and UPDATE
  privileges instead, like it's done for sequences?

- make check and make-check-world don't pass with this test only.  Given that
  the split is mostly done to ease review and probably not intended to be
  committed this way, we probably shouldn't spend efforts to clean up the split
  apart from making sure that each patch compiles cleanly on its own.  But in
  this case it brought my attention to misc_sanity.sql test.  Looking at patch
  0010, I see:

diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index a57fd142a9..ce9bad7211 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -60,7 +60,9 @@ ORDER BY 1, 2;
  pg_index                | indpred       | pg_node_tree
  pg_largeobject          | data          | bytea
  pg_largeobject_metadata | lomacl        | aclitem[]
-(11 rows)
+ pg_variable             | varacl        | aclitem[]
+ pg_variable             | vardefexpr    | pg_node_tree
+(13 rows)

This is the test for relations with varlena columns without TOAST table.  I
don't think that's correct to add those exceptions, and there should be a TOAST
table declared for pg_variable too, as noted in the comment above that query.

- nitpicking: s/catalogue/catalog/

Some other comments on other patches while testing things around:

- For sessionvariable.c (in 0002), I see that there are still all the comments
  and code about checking type validity based on a generation number and other
  heuristics.  I still fail to understand why this is needed at all as the
  stored datum should remain compatible as long as we prevent the few
  incompatible DDL that are also prevented when there's a relation dependency.
  As an example, I try to quickly disable all that code with the following:

diff --git a/src/backend/commands/sessionvariable.c b/src/backend/commands/sessionvariable.c
index 9b4f9482a4..7c8808dc46 100644
--- a/src/backend/commands/sessionvariable.c
+++ b/src/backend/commands/sessionvariable.c
@@ -794,6 +794,8 @@ svartype_verify_composite_fast(SVariableType svt)
 static int64
 get_svariable_valid_type_gennum(SVariableType svt)
 {
+   return 1;
+
    HeapTuple   tuple;
    bool        fast_check = true;

@@ -905,6 +907,8 @@ get_svariabletype(Oid typid)
 static bool
 session_variable_use_valid_type(SVariable svar)
 {
+   return true;
+
    Assert(svar);
    Assert(svar->svartype);

And session_variable.sql regression test still works just fine.  Am I missing
something?

While at it, the initial comment should probably say "free local memory" rather
than "purge memory".

- doc are missing for GRANT/REVOKE ... ON ALL VARIABLES

- plpgsql.sgml:
+   <sect3>
+    <title><command>Session variables and constants</command></title>

I don't think it's ok to use "constant" as an alias for immutable session
variable as immutable session variable can actually be changed.

Similarly, in catalogs.sgml:

+       <structfield>varisimmutable</structfield> <type>boolean</type>
+      </para>
+      <para>
+       True if the variable is immutable (cannot be modified). The default value is false.
+      </para></entry>
+     </row>

I think there should be a note and a link to the corresponding part in
create_variable.sgml to explain what exactly is an immutable variable, as the
implemented behavior (for nullable immutable variable) is somewhat unexpected.

- other nitpicking: pg_variable and struct Variable seems a bit inconsistent.
  For instance one uses vartype and vartypmod and the other typid and typmod,
  while both use varname and varnamespace.  I think we should avoid discrepancy
  here.

Also, there's a sessionvariable.c and a session_variable.h.  Let's use
session_variable.[ch], as it seems more readable?

-typedef patch: missing SVariableTypeData, some commits need a pgindent, e.g:

+typedef SVariableTypeData * SVariableType;

+typedef SVariableData * SVariable;

+static SessionVariableValue * RestoreSessionVariables(char **start_address,
+                                                   int *num_session_variables);

+static Query *transformLetStmt(ParseState *pstate,
+                              LetStmt * stmt);

(and multiple others)



Re: Schema variables - new implementation for Postgres 15

From
Justin Pryzby
Date:
> +-- test on query with workers
> +CREATE TABLE svar_test(a int);
> +INSERT INTO svar_test SELECT * FROM generate_series(1,1000000);

When I looked at this, I noticed this huge table.

I don't think you should create such a large table just for this.

To exercise parallel workers with a smaller table, decrease
min_parallel_table_scan_size and others as done in other regression tests.

-- 
Justin



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


po 22. 8. 2022 v 16:05 odesílatel Justin Pryzby <pryzby@telsasoft.com> napsal:
> +-- test on query with workers
> +CREATE TABLE svar_test(a int);
> +INSERT INTO svar_test SELECT * FROM generate_series(1,1000000);

When I looked at this, I noticed this huge table.

I don't think you should create such a large table just for this.

To exercise parallel workers with a smaller table, decrease
min_parallel_table_scan_size and others as done in other regression tests.


I fixed it.

Thank you for tip

Pavel
--
Justin

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


po 22. 8. 2022 v 9:33 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi Pavel,

On Sun, Aug 21, 2022 at 09:54:03AM +0200, Pavel Stehule wrote:
>
> should be fixed now

I started reviewing the patchset, beginning with 0001 (at least the parts that
don't substantially change later) and have a few comments.

- you define new AclMode READ and WRITE.  Those bits are precious and I don't
  think it's ok to consume 2 bits for session variables, especially since those
  are the last two bits available since the recent GUC access control patch
  (ACL_SET and ACL_ALTER_SYSTEM).  Maybe we could existing INSERT and UPDATE
  privileges instead, like it's done for sequences?


I have not a strong opinion about it.  AclMode is uint32 - so I think there are still 15bites reserved. I think so UPDATE and SELECT rights can work, but maybe it is better to use separate rights WRITE, READ to be stronger signalized so the variable is not the relation. On other hand large objects use ACL_UPDATE, ACL_SELECT too, and it works. So I am neutral in this question. Has somebody here some opinion on this point? If not I'll modify the patch like Julien proposes.

Regards

Pavel


Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Mon, Aug 22, 2022 at 09:13:39PM +0200, Pavel Stehule wrote:
> po 22. 8. 2022 v 9:33 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> >
> > - you define new AclMode READ and WRITE.  Those bits are precious and I
> > don't
> >   think it's ok to consume 2 bits for session variables, especially since
> > those
> >   are the last two bits available since the recent GUC access control patch
> >   (ACL_SET and ACL_ALTER_SYSTEM).  Maybe we could existing INSERT and
> > UPDATE
> >   privileges instead, like it's done for sequences?
> >
> >
> I have not a strong opinion about it.  AclMode is uint32 - so I think there
> are still 15bites reserved. I think so UPDATE and SELECT rights can work,
> but maybe it is better to use separate rights WRITE, READ to be stronger
> signalized so the variable is not the relation. On other hand large objects
> use ACL_UPDATE, ACL_SELECT too, and it works. So I am neutral in this
> question. Has somebody here some opinion on this point? If not I'll modify
> the patch like Julien proposes.

Actually no, because AclMode is also used to store the grant option part.  The
comment before AclMode warns about it:

 * The present representation of AclItem limits us to 16 distinct rights,
 * even though AclMode is defined as uint32.  See utils/acl.h.



Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Tue, Jan 18, 2022 at 10:01:01PM +0100, Pavel Stehule wrote:
>
> pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> > On Thu, Jan 13, 2022 at 07:32:26PM +0100, Pavel Stehule wrote:
> > > čt 13. 1. 2022 v 19:23 odesílatel Dean Rasheed <dean.a.rasheed@gmail.com
> > >
> > > > On Thu, 13 Jan 2022 at 17:42, Pavel Stehule <pavel.stehule@gmail.com>
> > > > wrote:
> > > > >
> > > > > I like the idea of prioritizing tables over variables with warnings
> > when
> > > > collision is detected. It cannot break anything. And it allows to using
> > > > short identifiers when there is not collision.
> > > >
> > > > Yeah, that seems OK, as long as it's clearly documented. I don't think
> > > > a warning is necessary.
> >
> > What should be the behavior for a cached plan that uses a variable when a
> > conflicting relation is later created?  I think that it should be the same
> > as a
> > search_path change and the plan should be discarded.
> >
> > > The warning can be disabled by default, but I think it should be there.
> > > This is a signal, so some in the database schema should be renamed.
> > Maybe -
> > > session_variables_ambiguity_warning.
> >
> > I agree that having a way to know that a variable has been bypassed can be
> > useful.
> >
>
> done

I've been thinking a bit more about the shadowing, and one scenario we didn't
discuss is something like this naive example:

CREATE TABLE tt(a text, b text);

CREATE TYPE abc AS (a text, b text, c text);
CREATE VARIABLE tt AS abc;

INSERT INTO tt SELECT 'a', 'b';
LET tt = ('x', 'y', 'z');

SELECT tt.a, tt.b, tt.c FROM tt;

Which, with the default configuration, currently returns

 a | b | c
---+---+---
 a | b | z
(1 row)

I feel a bit uncomfortable that the system allows mixing variable attributes
and relation columns for the same relation name.  This is even worse here as
part of the variable attributes are shadowed.

It feels like a good way to write valid queries that clearly won't do what you
think they do, a bit like the correlated sub-query trap, so maybe we should
have a way to prevent it.

What do you think?



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 23. 8. 2022 v 7:56 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Jan 18, 2022 at 10:01:01PM +0100, Pavel Stehule wrote:
>
> pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> > On Thu, Jan 13, 2022 at 07:32:26PM +0100, Pavel Stehule wrote:
> > > čt 13. 1. 2022 v 19:23 odesílatel Dean Rasheed <dean.a.rasheed@gmail.com
> > >
> > > > On Thu, 13 Jan 2022 at 17:42, Pavel Stehule <pavel.stehule@gmail.com>
> > > > wrote:
> > > > >
> > > > > I like the idea of prioritizing tables over variables with warnings
> > when
> > > > collision is detected. It cannot break anything. And it allows to using
> > > > short identifiers when there is not collision.
> > > >
> > > > Yeah, that seems OK, as long as it's clearly documented. I don't think
> > > > a warning is necessary.
> >
> > What should be the behavior for a cached plan that uses a variable when a
> > conflicting relation is later created?  I think that it should be the same
> > as a
> > search_path change and the plan should be discarded.
> >
> > > The warning can be disabled by default, but I think it should be there.
> > > This is a signal, so some in the database schema should be renamed.
> > Maybe -
> > > session_variables_ambiguity_warning.
> >
> > I agree that having a way to know that a variable has been bypassed can be
> > useful.
> >
>
> done

I've been thinking a bit more about the shadowing, and one scenario we didn't
discuss is something like this naive example:

CREATE TABLE tt(a text, b text);

CREATE TYPE abc AS (a text, b text, c text);
CREATE VARIABLE tt AS abc;

INSERT INTO tt SELECT 'a', 'b';
LET tt = ('x', 'y', 'z');

SELECT tt.a, tt.b, tt.c FROM tt;

Which, with the default configuration, currently returns

 a | b | c
---+---+---
 a | b | z
(1 row)

I feel a bit uncomfortable that the system allows mixing variable attributes
and relation columns for the same relation name.  This is even worse here as
part of the variable attributes are shadowed.

It feels like a good way to write valid queries that clearly won't do what you
think they do, a bit like the correlated sub-query trap, so maybe we should
have a way to prevent it.

What do you think?

I thought about it before. I think valid RTE (but with the missing column) can shadow the variable too.

With this change your query fails:

(2022-08-23 11:05:55) postgres=# SELECT tt.a, tt.b, tt.c FROM tt;
ERROR:  column tt.c does not exist
LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
                           ^
(2022-08-23 11:06:03) postgres=# set session_variables_ambiguity_warning to on;
SET
(2022-08-23 11:06:19) postgres=# SELECT tt.a, tt.b, tt.c FROM tt;
WARNING:  session variable "tt.a" is shadowed
LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
               ^
DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
WARNING:  session variable "tt.b" is shadowed
LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
                     ^
DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
WARNING:  session variable "public.tt" is shadowed
LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
                           ^
DETAIL:   Session variables can be shadowed by tables or table's aliases with the same name.
ERROR:  column tt.c does not exist
LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
                           ^
Regards

Pavel

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 23. 8. 2022 v 3:57 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Mon, Aug 22, 2022 at 09:13:39PM +0200, Pavel Stehule wrote:
> po 22. 8. 2022 v 9:33 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> >
> > - you define new AclMode READ and WRITE.  Those bits are precious and I
> > don't
> >   think it's ok to consume 2 bits for session variables, especially since
> > those
> >   are the last two bits available since the recent GUC access control patch
> >   (ACL_SET and ACL_ALTER_SYSTEM).  Maybe we could existing INSERT and
> > UPDATE
> >   privileges instead, like it's done for sequences?
> >
> >
> I have not a strong opinion about it.  AclMode is uint32 - so I think there
> are still 15bites reserved. I think so UPDATE and SELECT rights can work,
> but maybe it is better to use separate rights WRITE, READ to be stronger
> signalized so the variable is not the relation. On other hand large objects
> use ACL_UPDATE, ACL_SELECT too, and it works. So I am neutral in this
> question. Has somebody here some opinion on this point? If not I'll modify
> the patch like Julien proposes.

Actually no, because AclMode is also used to store the grant option part.  The
comment before AclMode warns about it:

 * The present representation of AclItem limits us to 16 distinct rights,
 * even though AclMode is defined as uint32.  See utils/acl.h.

I missed this. I changed ACL to your proposal in today's patch

Thank you for your corrections.

Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Tue, Aug 23, 2022 at 11:27:45AM +0200, Pavel Stehule wrote:
> út 23. 8. 2022 v 7:56 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> >
> > I've been thinking a bit more about the shadowing, and one scenario we
> > didn't
> > discuss is something like this naive example:
> >
> > CREATE TABLE tt(a text, b text);
> >
> > CREATE TYPE abc AS (a text, b text, c text);
> > CREATE VARIABLE tt AS abc;
> >
> > INSERT INTO tt SELECT 'a', 'b';
> > LET tt = ('x', 'y', 'z');
> >
> > SELECT tt.a, tt.b, tt.c FROM tt;
> >
> > Which, with the default configuration, currently returns
> >
> >  a | b | c
> > ---+---+---
> >  a | b | z
> > (1 row)
> >
> > I feel a bit uncomfortable that the system allows mixing variable
> > attributes
> > and relation columns for the same relation name.  This is even worse here
> > as
> > part of the variable attributes are shadowed.
> >
> > It feels like a good way to write valid queries that clearly won't do what
> > you
> > think they do, a bit like the correlated sub-query trap, so maybe we should
> > have a way to prevent it.
> >
> > What do you think?
> >
>
> I thought about it before. I think valid RTE (but with the missing column)
> can shadow the variable too.
>
> With this change your query fails:
>
> (2022-08-23 11:05:55) postgres=# SELECT tt.a, tt.b, tt.c FROM tt;
> ERROR:  column tt.c does not exist
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                            ^
> (2022-08-23 11:06:03) postgres=# set session_variables_ambiguity_warning to
> on;
> SET
> (2022-08-23 11:06:19) postgres=# SELECT tt.a, tt.b, tt.c FROM tt;
> WARNING:  session variable "tt.a" is shadowed
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                ^
> DETAIL:  Session variables can be shadowed by columns, routine's variables
> and routine's arguments with the same name.
> WARNING:  session variable "tt.b" is shadowed
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                      ^
> DETAIL:  Session variables can be shadowed by columns, routine's variables
> and routine's arguments with the same name.
> WARNING:  session variable "public.tt" is shadowed
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                            ^
> DETAIL:   Session variables can be shadowed by tables or table's aliases
> with the same name.
> ERROR:  column tt.c does not exist
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;

Great, thanks a lot!

Could you add some regression tests for that scenario in the next version,
since this is handled by some new code?  It will also probably be useful to
remind any possible committer about that choice.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi


po 22. 8. 2022 v 9:33 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi Pavel,

On Sun, Aug 21, 2022 at 09:54:03AM +0200, Pavel Stehule wrote:
>
> should be fixed now

I started reviewing the patchset, beginning with 0001 (at least the parts that
don't substantially change later) and have a few comments.

- you define new AclMode READ and WRITE.  Those bits are precious and I don't
  think it's ok to consume 2 bits for session variables, especially since those
  are the last two bits available since the recent GUC access control patch
  (ACL_SET and ACL_ALTER_SYSTEM).  Maybe we could existing INSERT and UPDATE
  privileges instead, like it's done for sequences?

changed - now ACL_SELECT and ACL_UPDATE are used
 

- make check and make-check-world don't pass with this test only.  Given that
  the split is mostly done to ease review and probably not intended to be
  committed this way, we probably shouldn't spend efforts to clean up the split
  apart from making sure that each patch compiles cleanly on its own.  But in
  this case it brought my attention to misc_sanity.sql test.  Looking at patch
  0010, I see:

diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index a57fd142a9..ce9bad7211 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -60,7 +60,9 @@ ORDER BY 1, 2;
  pg_index                | indpred       | pg_node_tree
  pg_largeobject          | data          | bytea
  pg_largeobject_metadata | lomacl        | aclitem[]
-(11 rows)
+ pg_variable             | varacl        | aclitem[]
+ pg_variable             | vardefexpr    | pg_node_tree
+(13 rows)

This is the test for relations with varlena columns without TOAST table.  I
don't think that's correct to add those exceptions, and there should be a TOAST
table declared for pg_variable too, as noted in the comment above that query.

- nitpicking: s/catalogue/catalog/

Some other comments on other patches while testing things around:

fixed
 

- For sessionvariable.c (in 0002), I see that there are still all the comments
  and code about checking type validity based on a generation number and other
  heuristics.  I still fail to understand why this is needed at all as the
  stored datum should remain compatible as long as we prevent the few
  incompatible DDL that are also prevented when there's a relation dependency.
  As an example, I try to quickly disable all that code with the following:

I am not able to test (in this situation) the situation where gennum is increased, but I think it is possible, and there are few situations where dependency is not enough. But maybe my thoughts are too pessimistic, and this aparate is not necessary.

1. update of binary custom type - the dependency allows an extension update, and after update the binary format can be changed. Now I think this part is useless, because although the extension can be updated, the dll cannot be unloaded, so the loaded implementation of custom session type will be the same until session end.

2. altering composite type - the generation number reduces overhead with checking compatibility of stored value and expected value. With gennum I need to run compatibility checks just once per transaction. When the gennum is the same, I can return data without any conversion. 

3. I try to use gennum for detection of oid overflow. The value is stored in the session memory context in the hash table. The related memory can be cleaned at transaction end (when memory is deleted) and when I can read system catalog (transaction is not aborted). When a transaction is aborted, then I cannot read the system catalog, and I have to postpone cleaning to the next usage of the session variable. Theoretically, the session can be inactive for a longer time and the system catalog can be changed a lot (and the oid counter can be restarted). 

I am checking:

3.1 if variable with oid still exists

3.2 if the variable has assigned type with same oid

3.3. if type fingerprint is same - and I can expect so the type with same oid is same type

3.2 and 3.3 are safe guard for cases where oid is restarted, and I cannot believe the consistency of values stored in memory.

This is a very different situation than for example temporary tables. Every temp table for every session has its own entry in the system catalog, so protection based on dependency can work. But record of session variable is shared - It is protected inside transaction, but session variables are living in session. Without transaction there is not any lock on the item in pg_variable, so I can drop the session variable although the value is stored in session memory in some other session. After dropping the related plans are resetted, but the stored value itself stays in memory and can be accessed - if some future variable takes the same oid. With gennum I have 3x checks - that should ensure that the returned value should be always binary valid.

Now, I am thinking about another, maybe more simple identity check, and it should to work and it can less code than solution based on type's fingerprints

I can introduce a 64bit sequence and I can store the value of seq in pg_variable record. Then the identity check can be just savedoid = oid and savedseqnum = seqnum

What do you think about this idea? The overhead of that can be reduced, because for on transaction commit drop or on transaction end reset session variables we don't need it.


 

diff --git a/src/backend/commands/sessionvariable.c b/src/backend/commands/sessionvariable.c
index 9b4f9482a4..7c8808dc46 100644
--- a/src/backend/commands/sessionvariable.c
+++ b/src/backend/commands/sessionvariable.c
@@ -794,6 +794,8 @@ svartype_verify_composite_fast(SVariableType svt)
 static int64
 get_svariable_valid_type_gennum(SVariableType svt)
 {
+   return 1;
+
    HeapTuple   tuple;
    bool        fast_check = true;

@@ -905,6 +907,8 @@ get_svariabletype(Oid typid)
 static bool
 session_variable_use_valid_type(SVariable svar)
 {
+   return true;
+
    Assert(svar);
    Assert(svar->svartype);

And session_variable.sql regression test still works just fine.  Am I missing
something?

the regress test doesn't try to reset oid counter
 

While at it, the initial comment should probably say "free local memory" rather
than "purge memory".

changed
 

- doc are missing for GRANT/REVOKE ... ON ALL VARIABLES

done
 

- plpgsql.sgml:
+   <sect3>
+    <title><command>Session variables and constants</command></title>


rewroted just to "Session variables"

 
I don't think it's ok to use "constant" as an alias for immutable session
variable as immutable session variable can actually be changed.

Similarly, in catalogs.sgml:

+       <structfield>varisimmutable</structfield> <type>boolean</type>
+      </para>
+      <para>
+       True if the variable is immutable (cannot be modified). The default value is false.
+      </para></entry>
+     </row>

I think there should be a note and a link to the corresponding part in
create_variable.sgml to explain what exactly is an immutable variable, as the
implemented behavior (for nullable immutable variable) is somewhat unexpected.

done
 

- other nitpicking: pg_variable and struct Variable seems a bit inconsistent.
  For instance one uses vartype and vartypmod and the other typid and typmod,
  while both use varname and varnamespace.  I think we should avoid discrepancy
  here.

I did it because I needed to rename the namespace field, but the prefix var is not the best. I don't think so using same names like pg_variable in Variable is good idea (due fields like varisnotnull, varisimmutable), but I can the rename varnane and varnamespace to name and namespaceid, what is better than varname, and varnamespace.


Also, there's a sessionvariable.c and a session_variable.h.  Let's use
session_variable.[ch], as it seems more readable?

renamed
 

-typedef patch: missing SVariableTypeData, some commits need a pgindent, e.g:

+typedef SVariableTypeData * SVariableType;

+typedef SVariableData * SVariable;

+static SessionVariableValue * RestoreSessionVariables(char **start_address,
+                                                   int *num_session_variables);

+static Query *transformLetStmt(ParseState *pstate,
+                              LetStmt * stmt);

(and multiple others)

I fixed these.

Thank you for comments

Pavel
 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 23. 8. 2022 v 14:57 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Tue, Aug 23, 2022 at 11:27:45AM +0200, Pavel Stehule wrote:
> út 23. 8. 2022 v 7:56 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> >
> > I've been thinking a bit more about the shadowing, and one scenario we
> > didn't
> > discuss is something like this naive example:
> >
> > CREATE TABLE tt(a text, b text);
> >
> > CREATE TYPE abc AS (a text, b text, c text);
> > CREATE VARIABLE tt AS abc;
> >
> > INSERT INTO tt SELECT 'a', 'b';
> > LET tt = ('x', 'y', 'z');
> >
> > SELECT tt.a, tt.b, tt.c FROM tt;
> >
> > Which, with the default configuration, currently returns
> >
> >  a | b | c
> > ---+---+---
> >  a | b | z
> > (1 row)
> >
> > I feel a bit uncomfortable that the system allows mixing variable
> > attributes
> > and relation columns for the same relation name.  This is even worse here
> > as
> > part of the variable attributes are shadowed.
> >
> > It feels like a good way to write valid queries that clearly won't do what
> > you
> > think they do, a bit like the correlated sub-query trap, so maybe we should
> > have a way to prevent it.
> >
> > What do you think?
> >
>
> I thought about it before. I think valid RTE (but with the missing column)
> can shadow the variable too.
>
> With this change your query fails:
>
> (2022-08-23 11:05:55) postgres=# SELECT tt.a, tt.b, tt.c FROM tt;
> ERROR:  column tt.c does not exist
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                            ^
> (2022-08-23 11:06:03) postgres=# set session_variables_ambiguity_warning to
> on;
> SET
> (2022-08-23 11:06:19) postgres=# SELECT tt.a, tt.b, tt.c FROM tt;
> WARNING:  session variable "tt.a" is shadowed
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                ^
> DETAIL:  Session variables can be shadowed by columns, routine's variables
> and routine's arguments with the same name.
> WARNING:  session variable "tt.b" is shadowed
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                      ^
> DETAIL:  Session variables can be shadowed by columns, routine's variables
> and routine's arguments with the same name.
> WARNING:  session variable "public.tt" is shadowed
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;
>                            ^
> DETAIL:   Session variables can be shadowed by tables or table's aliases
> with the same name.
> ERROR:  column tt.c does not exist
> LINE 1: SELECT tt.a, tt.b, tt.c FROM tt;

Great, thanks a lot!

Could you add some regression tests for that scenario in the next version,
since this is handled by some new code?  It will also probably be useful to
remind any possible committer about that choice.

it is there

Regards

Pavel
 

Re: Schema variables - new implementation for Postgres 15

From
Erik Rijkers
Date:
Op 24-08-2022 om 08:37 schreef Pavel Stehule:
>>
> 
> I fixed these.
> 

 > [v20220824-1-*.patch]

Hi Pavel,

I noticed just now that variable assignment (i.e., LET) unexpectedly 
(for me anyway) cast the type of the input value. Surely that's wrong? 
The documentation says clearly enough:

'The result must be of the same data type as the session variable.'


Example:

create variable x integer;
let x=1.5;
select x, pg_typeof(x);
  x | pg_typeof
---+-----------
  2 | integer
(1 row)


Is this correct?

If such casts (there are several) are intended then the text of the 
documentation should be changed.

Thanks,

Erik




Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 24. 8. 2022 v 10:04 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
Op 24-08-2022 om 08:37 schreef Pavel Stehule:
>>
>
> I fixed these.
>

 > [v20220824-1-*.patch]

Hi Pavel,

I noticed just now that variable assignment (i.e., LET) unexpectedly
(for me anyway) cast the type of the input value. Surely that's wrong?
The documentation says clearly enough:

'The result must be of the same data type as the session variable.'


Example:

create variable x integer;
let x=1.5;
select x, pg_typeof(x);
  x | pg_typeof
---+-----------
  2 | integer
(1 row)


Is this correct?

If such casts (there are several) are intended then the text of the
documentation should be changed.

yes - the behave is designed like plpgsql assignment or SQL assignment

 (2022-08-25 19:35:35) postgres=# do $$
postgres$# declare i int;
postgres$# begin
postgres$#   i := 1.5;
postgres$#   raise notice '%', i;
postgres$# end;
postgres$# $$;
NOTICE:  2
DO

(2022-08-25 19:38:10) postgres=# create table foo1(a int);
CREATE TABLE
(2022-08-25 19:38:13) postgres=# insert into foo1 values(1.5);
INSERT 0 1
(2022-08-25 19:38:21) postgres=# select * from foo1;
┌───┐
│ a │
╞═══╡
│ 2 │
└───┘
(1 row)

There are the same rules as in SQL.

This sentence is not good - the value should be casteable to the target type.

Regards

Pavel





Thanks,

Erik

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


Hi


- For sessionvariable.c (in 0002), I see that there are still all the comments
  and code about checking type validity based on a generation number and other
  heuristics.  I still fail to understand why this is needed at all as the
  stored datum should remain compatible as long as we prevent the few
  incompatible DDL that are also prevented when there's a relation dependency.
  As an example, I try to quickly disable all that code with the following:



I am sending an alternative implementation based on using own int8 sequence as protection against unwanted oid equation of different session's variables.

This code is much shorter, and, I think better, but now, the creating sequence in bootstrap time is dirty. Maybe instead the sequence can be used 64bite timestamp or some else - it needs a unique combination of oid, 8byte.

Regards

Pavel


 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Sat, Aug 27, 2022 at 01:17:45PM +0200, Pavel Stehule wrote:
>
> after some thinking I think that instead of sequence I can use LSN. The
> combination oid, LSN should be unique forever

Yeah I was about suggesting doing that instead of a sequence, so +1 for that
approach!

I've been spending a bit of time trying to improve the test coverage on the
protection for concurrently deleted and recreated variables, and thought that a
new isolation test should be enough.  I'm attaching a diff (in .txt extension)
that could be applied to 009-regress-tests-for-session-variables.patch, but
while working on that I discovered a few problems.

First, the pg_debug_show_used_session_variables() function reports what's
currently locally known, but there's no guarantee that
AcceptInvalidationMessages() will be called prior to its execution.  For
instance if you're in a transaction and already hold a lock on the function and
execute it again.

It therefore means that it can display that a locally cached variable isn't
dropped and still holds a value, while it's not the case.  While it may be
surprising, I think that's still the wanted behavior as you want to know what
is the cache state.  FTR this is tested in the last permutation in the attached
patch (although the expected output contains the up-to-date information, so you
can see the failure).

But if invalidation are processed when calling the function, the behavior seems
surprising as far as I can see the cleanup seems to be done in 2 steps: mark t
he hash entry as removed and then remove the hash entry.  For instance:

(conn 1) CREATE VARIABLE myvar AS text;
(conn 1) LET myvar = 'something';
(conn 2) DROP VARIABLE myvar;
(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name  | removed
--------+-------+---------
 public | myvar | t
(1 row)

(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name | removed
--------+------+---------
(0 rows)

Why are two steps necessary here, and is that really wanted?

Finally, I noticed that it's quite easy to get cache lookup failures when using
transactions.  AFAICS it's because the current code first checks in the local
cache (which often isn't immediately invalidated when in a transaction),
returns an oid (of an already dropped variable), then the code acquires a lock
on that non-existent variable, which internally accepts invalidation after the
lock is acquired.  The rest of the code can then fail with some "cache lookup
error" in the various functions as the invalidation has now been processed.
This is also tested in the attached isolation test.

I think that using a retry approach based on SharedInvalidMessageCounter change
detection, like RangeVarGetRelidExtended(), in IdentifyVariable() should be
enough to fix that class of problem, but maybe some other general functions
would need similar protection too.

While looking at the testing, I also noticed that the main regression tests
comments are now outdated since the new (and more permissive) approach for
dropped variable detection.  For instance:

+ ALTER TYPE public.svar_test_type DROP ATTRIBUTE c;
+ -- should to fail
+ SELECT public.svar;
+   svar   
+ ---------
+  (10,20)
+ (1 row)
+ 
+ ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
+ -- should to fail too (different type, different generation number);
+ SELECT public.svar;
+    svar   
+ ----------
+  (10,20,)
+ (1 row)

I'm also unsure if this one is showing a broken behavior or not:

+ CREATE VARIABLE public.avar AS int;
+ -- should to fail
+ SELECT avar FROM xxtab;
+  avar
+ ------
+    10
+ (1 row)
+ 
+ -- should be ok
+ SELECT public.avar FROM xxtab;
+  avar
+ ------
+ 
+ (1 row)


For reference, with the code as-is I get the following diff when testing the
attached isolation test:

--- /Users/rjuju/git/postgresql/src/test/isolation/expected/session-variable.out    2022-08-29 15:41:11.000000000
+0800
+++ /Users/rjuju/git/pg/pgmaster_debug/src/test/isolation/output_iso/results/session-variable.out    2022-08-29
15:42:17.000000000+0800
 
@@ -16,21 +16,21 @@
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

 step s1: BEGIN;
 step drop: DROP VARIABLE myvar;
 step val: SELECT myvar;
-ERROR:  column or variable "myvar" does not exist
+ERROR:  cache lookup failed for session variable 16386
 step sr1: ROLLBACK;

 starting permutation: let val dbg drop create dbg val
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

@@ -68,20 +68,16 @@
 schema|name |removed
 ------+-----+-------
 public|myvar|f
 (1 row)

 step drop: DROP VARIABLE myvar;
 step create: CREATE VARIABLE myvar AS text
 step dbg: SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema|name |removed
 ------+-----+-------
-public|myvar|t
+public|myvar|f
 (1 row)

 step val: SELECT myvar;
-myvar
------
-
-(1 row)
-
+ERROR:  cache lookup failed for session variable 16389
 step sr1: ROLLBACK;


Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 24. 8. 2022 v 10:04 odesílatel Erik Rijkers <er@xs4all.nl> napsal:
Op 24-08-2022 om 08:37 schreef Pavel Stehule:
>>
>
> I fixed these.
>

 > [v20220824-1-*.patch]

Hi Pavel,

I noticed just now that variable assignment (i.e., LET) unexpectedly
(for me anyway) cast the type of the input value. Surely that's wrong?
The documentation says clearly enough:

'The result must be of the same data type as the session variable.'


Example:

create variable x integer;
let x=1.5;
select x, pg_typeof(x);
  x | pg_typeof
---+-----------
  2 | integer
(1 row)


Is this correct?

If such casts (there are several) are intended then the text of the
documentation should be changed.


I changed this

 @@ -58,8 +58,9 @@ LET <replaceable class="parameter">session_variable</replaceable> = DEFAULT
     <term><literal>sql_expression</literal></term>
     <listitem>
      <para>
-      An SQL expression, in parentheses. The result must be of the same data type as the session
-      variable.
+      An SQL expression (can be subquery in parenthesis). The result must
+      be of castable to the same data type as the session variable (in
+      implicit or assignment context).
      </para>
     </listitem>
    </varlistentry>

is it ok?

Regards

Pavel


Thanks,

Erik

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

po 29. 8. 2022 v 11:00 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Sat, Aug 27, 2022 at 01:17:45PM +0200, Pavel Stehule wrote:
>
> after some thinking I think that instead of sequence I can use LSN. The
> combination oid, LSN should be unique forever

Yeah I was about suggesting doing that instead of a sequence, so +1 for that
approach!

I've been spending a bit of time trying to improve the test coverage on the
protection for concurrently deleted and recreated variables, and thought that a
new isolation test should be enough.  I'm attaching a diff (in .txt extension)
that could be applied to 009-regress-tests-for-session-variables.patch, but
while working on that I discovered a few problems.

First, the pg_debug_show_used_session_variables() function reports what's
currently locally known, but there's no guarantee that
AcceptInvalidationMessages() will be called prior to its execution.  For
instance if you're in a transaction and already hold a lock on the function and
execute it again.

It therefore means that it can display that a locally cached variable isn't
dropped and still holds a value, while it's not the case.  While it may be
surprising, I think that's still the wanted behavior as you want to know what
is the cache state.  FTR this is tested in the last permutation in the attached
patch (although the expected output contains the up-to-date information, so you
can see the failure).

But if invalidation are processed when calling the function, the behavior seems
surprising as far as I can see the cleanup seems to be done in 2 steps: mark t
he hash entry as removed and then remove the hash entry.  For instance:

(conn 1) CREATE VARIABLE myvar AS text;
(conn 1) LET myvar = 'something';
(conn 2) DROP VARIABLE myvar;
(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name  | removed
--------+-------+---------
 public | myvar | t
(1 row)

(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name | removed
--------+------+---------
(0 rows)

Why are two steps necessary here, and is that really wanted?

The value is removed in the first command, but at the end of transaction. pg_debug_show_used_session_variables is called before, and at this moment the variable should be in memory.

I enhanced pg_debug_show_used_session_variables about debug output for start and end, and you can see it.

(2022-08-30 19:38:49) postgres=# set client_min_messages to debug1;
SET
(2022-08-30 19:38:55) postgres=# CREATE VARIABLE myvar AS text;
DEBUG:  record for session variable "myvar" (oid:16390) was created in pg_variable
CREATE VARIABLE
(2022-08-30 19:39:03) postgres=# LET myvar = 'something';
DEBUG:  session variable "public.myvar" (oid:16390) has new entry in memory (emitted by WRITE)
DEBUG:  session variable "public.myvar" (oid:16390) has new value
LET
(2022-08-30 19:39:11) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
DEBUG:  pg_variable_cache_callback 84 2941368844
DEBUG:  session variable "public.myvar" (oid:16390) should be rechecked (forced by sinval)
DEBUG:  pg_debug_show_used_session_variables start
DEBUG:  effective call of sync_sessionvars_all()
DEBUG:  pg_debug_show_used_session_variables end
DEBUG:  session variable "public.myvar" (oid:16390) is removing from memory
┌────────┬───────┬─────────┐
│ schema │ name  │ removed │
╞════════╪═══════╪═════════╡
│ public │ myvar │ t       │
└────────┴───────┴─────────┘
(1 row)

(2022-08-30 19:39:32) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
DEBUG:  pg_debug_show_used_session_variables start
DEBUG:  pg_debug_show_used_session_variables end
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
└────────┴──────┴─────────┘
(0 rows)


But I missed call sync_sessionvars_all in the drop variable. If I execute this routine there I can fix this behavior and the cleaning in sync_sessionvars_all can be more aggressive.

After change

(2022-08-31 06:25:54) postgres=# let x = 10;
LET
(2022-08-31 06:25:59) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
│ public │ x    │ f       │
└────────┴──────┴─────────┘
(1 row)

-- after drop in other session

(2022-08-31 06:26:00) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
└────────┴──────┴─────────┘
(0 rows)




 

Finally, I noticed that it's quite easy to get cache lookup failures when using
transactions.  AFAICS it's because the current code first checks in the local
cache (which often isn't immediately invalidated when in a transaction),
returns an oid (of an already dropped variable), then the code acquires a lock
on that non-existent variable, which internally accepts invalidation after the
lock is acquired.  The rest of the code can then fail with some "cache lookup
error" in the various functions as the invalidation has now been processed.
This is also tested in the attached isolation test.

I think that using a retry approach based on SharedInvalidMessageCounter change
detection, like RangeVarGetRelidExtended(), in IdentifyVariable() should be
enough to fix that class of problem, but maybe some other general functions
would need similar protection too.

I did it, and with this change it passed the isolation test. Thank you for your important help!

 

While looking at the testing, I also noticed that the main regression tests
comments are now outdated since the new (and more permissive) approach for
dropped variable detection.  For instance:

+ ALTER TYPE public.svar_test_type DROP ATTRIBUTE c;
+ -- should to fail
+ SELECT public.svar;
+   svar   
+ ---------
+  (10,20)
+ (1 row)
+
+ ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
+ -- should to fail too (different type, different generation number);
+ SELECT public.svar;
+    svar   
+ ----------
+  (10,20,)
+ (1 row)


the comments are obsolete, fixed
 

+ CREATE VARIABLE public.avar AS int;
+ -- should to fail
+ SELECT avar FROM xxtab;
+  avar
+ ------
+    10
+ (1 row)
+
+ -- should be ok
+ SELECT public.avar FROM xxtab;
+  avar
+ ------
+
+ (1 row)

fixed
 


For reference, with the code as-is I get the following diff when testing the
attached isolation test:

--- /Users/rjuju/git/postgresql/src/test/isolation/expected/session-variable.out        2022-08-29 15:41:11.000000000 +0800
+++ /Users/rjuju/git/pg/pgmaster_debug/src/test/isolation/output_iso/results/session-variable.out       2022-08-29 15:42:17.000000000 +0800
@@ -16,21 +16,21 @@
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

 step s1: BEGIN;
 step drop: DROP VARIABLE myvar;
 step val: SELECT myvar;
-ERROR:  column or variable "myvar" does not exist
+ERROR:  cache lookup failed for session variable 16386
 step sr1: ROLLBACK;

 starting permutation: let val dbg drop create dbg val
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

@@ -68,20 +68,16 @@
 schema|name |removed
 ------+-----+-------
 public|myvar|f
 (1 row)

 step drop: DROP VARIABLE myvar;
 step create: CREATE VARIABLE myvar AS text
 step dbg: SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema|name |removed
 ------+-----+-------
-public|myvar|t
+public|myvar|f
 (1 row)

 step val: SELECT myvar;
-myvar
------
-
-(1 row)
-
+ERROR:  cache lookup failed for session variable 16389
 step sr1: ROLLBACK;


attached updated patches

Regards

Pavel
 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Fri, Sep 02, 2022 at 07:42:00AM +0200, Pavel Stehule wrote:
>
> rebased today

After some off-list discussion with Pavel, I'm sending some proposal patches
(in .txt extension) to apply to the last patchset.

To sum up, when a session issues a DROP VARIABLE, the session will receive an
sinval notification for its own drop and we don't want to process it
immediately as we need to hold the value in case the transaction is rollbacked.
The current patch avoided that by forcing a single processing of sinval per
transaction, and forcing it before dropping the variable.  It works but it
seems to me that postponing all but the first VARIABLEOID sinval to the next
transaction is a big hammer, and the sooner we can free some memory the better.

For an alternative approach the attached patch store the lxid in the SVariable
itself when dropping a currently set variable, so we can process all sinval and
simply defer to the next transaction the memory cleanup of the variable(s) we
know we just dropped.  What do you think of that approach?

As I was working on some changes I also made a pass on session_variable.c.  I
tried to improve a bit some comments, and also got rid of the "first_time"
variable.  The name wasn't really great, and AFAICS it can be replaced by
testing whether the memory context has been created yet or not.

But once that done I noticed the get_rowtype_value() function.  I don't think
this function is necessary as the core code already knows how to deal with
stored datum when the underlying composite type was modified.  I tried to
bypass that function and always simply return the stored value and all the
tests run fine.  Is there really any cases when this extra code is needed?

FTR I tried to do a bunch of additional testing using relation as base type for
variable, as you can do more with those than plain composite types, but it
still always works just fine.

However, while doing so I noticed that find_composite_type_dependencies()
failed to properly handle dependencies on relation (plain tables, matviews and
partitioned tables).  I'm also adding 2 additional patches to fix this corner
case and add an additional regression test for the plain table case.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Sat, Sep 03, 2022 at 11:00:51PM +0800, Julien Rouhaud wrote:
> Hi,
>
> On Fri, Sep 02, 2022 at 07:42:00AM +0200, Pavel Stehule wrote:
> >
> > rebased today
>
> After some off-list discussion with Pavel, I'm sending some proposal patches
> (in .txt extension) to apply to the last patchset.
>
> To sum up, when a session issues a DROP VARIABLE, the session will receive an
> sinval notification for its own drop and we don't want to process it
> immediately as we need to hold the value in case the transaction is rollbacked.
> The current patch avoided that by forcing a single processing of sinval per
> transaction, and forcing it before dropping the variable.  It works but it
> seems to me that postponing all but the first VARIABLEOID sinval to the next
> transaction is a big hammer, and the sooner we can free some memory the better.
>
> For an alternative approach the attached patch store the lxid in the SVariable
> itself when dropping a currently set variable, so we can process all sinval and
> simply defer to the next transaction the memory cleanup of the variable(s) we
> know we just dropped.  What do you think of that approach?
>
> As I was working on some changes I also made a pass on session_variable.c.  I
> tried to improve a bit some comments, and also got rid of the "first_time"
> variable.  The name wasn't really great, and AFAICS it can be replaced by
> testing whether the memory context has been created yet or not.
>
> But once that done I noticed the get_rowtype_value() function.  I don't think
> this function is necessary as the core code already knows how to deal with
> stored datum when the underlying composite type was modified.  I tried to
> bypass that function and always simply return the stored value and all the
> tests run fine.  Is there really any cases when this extra code is needed?
>
> FTR I tried to do a bunch of additional testing using relation as base type for
> variable, as you can do more with those than plain composite types, but it
> still always works just fine.
>
> However, while doing so I noticed that find_composite_type_dependencies()
> failed to properly handle dependencies on relation (plain tables, matviews and
> partitioned tables).  I'm also adding 2 additional patches to fix this corner
> case and add an additional regression test for the plain table case.

I forgot to mention this chunk:

+    /*
+     * Although the value of domain type should be valid (it is
+     * checked when it is assigned to session variable), we have to
+     * check related constraints anytime. It can be more expensive
+     * than in PL/pgSQL. PL/pgSQL forces domain checks when value
+     * is assigned to the variable or when value is returned from
+     * function. Fortunately, domain types manage cache of constraints by
+     * self.
+     */
+    if (svar->is_domain)
+    {
+        MemoryContext oldcxt = CurrentMemoryContext;
+
+        /*
+         * Store domain_check extra in CurTransactionContext. When we are
+         * in other transaction, the domain_check_extra cache is not valid.
+         */
+        if (svar->domain_check_extra_lxid != MyProc->lxid)
+            svar->domain_check_extra = NULL;
+
+        domain_check(svar->value, svar->isnull,
+                     svar->typid, &svar->domain_check_extra,
+                     CurTransactionContext);
+
+        svar->domain_check_extra_lxid = MyProc->lxid;
+
+        MemoryContextSwitchTo(oldcxt);
+    }

I agree that storing the domain_check_extra in the transaction context sounds
sensible, but the memory context handling is not quite right.

Looking at domain_check, it doesn't change the current memory context, so as-is
all the code related to oldcxt is unnecessary.

Some other callers like expandedrecord.c do switch to a short lived context to
limit the lifetime of the possible leak by the expression evaluation, but I
don't think that's an option here.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 4. 9. 2022 v 6:31 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Sat, Sep 03, 2022 at 11:00:51PM +0800, Julien Rouhaud wrote:
> Hi,
>
> On Fri, Sep 02, 2022 at 07:42:00AM +0200, Pavel Stehule wrote:
> >
> > rebased today
>
> After some off-list discussion with Pavel, I'm sending some proposal patches
> (in .txt extension) to apply to the last patchset.
>
> To sum up, when a session issues a DROP VARIABLE, the session will receive an
> sinval notification for its own drop and we don't want to process it
> immediately as we need to hold the value in case the transaction is rollbacked.
> The current patch avoided that by forcing a single processing of sinval per
> transaction, and forcing it before dropping the variable.  It works but it
> seems to me that postponing all but the first VARIABLEOID sinval to the next
> transaction is a big hammer, and the sooner we can free some memory the better.
>
> For an alternative approach the attached patch store the lxid in the SVariable
> itself when dropping a currently set variable, so we can process all sinval and
> simply defer to the next transaction the memory cleanup of the variable(s) we
> know we just dropped.  What do you think of that approach?
>
> As I was working on some changes I also made a pass on session_variable.c.  I
> tried to improve a bit some comments, and also got rid of the "first_time"
> variable.  The name wasn't really great, and AFAICS it can be replaced by
> testing whether the memory context has been created yet or not.
>
> But once that done I noticed the get_rowtype_value() function.  I don't think
> this function is necessary as the core code already knows how to deal with
> stored datum when the underlying composite type was modified.  I tried to
> bypass that function and always simply return the stored value and all the
> tests run fine.  Is there really any cases when this extra code is needed?

Yes, it can works because there is not visible difference between NULL and dropped columns, and real number of attributes is saved in HeapTupleHeader

so I removed this function and related code

 
>
> FTR I tried to do a bunch of additional testing using relation as base type for
> variable, as you can do more with those than plain composite types, but it
> still always works just fine.
>
> However, while doing so I noticed that find_composite_type_dependencies()
> failed to properly handle dependencies on relation (plain tables, matviews and
> partitioned tables).  I'm also adding 2 additional patches to fix this corner
> case and add an additional regression test for the plain table case.

I forgot to mention this chunk:

+       /*
+        * Although the value of domain type should be valid (it is
+        * checked when it is assigned to session variable), we have to
+        * check related constraints anytime. It can be more expensive
+        * than in PL/pgSQL. PL/pgSQL forces domain checks when value
+        * is assigned to the variable or when value is returned from
+        * function. Fortunately, domain types manage cache of constraints by
+        * self.
+        */
+       if (svar->is_domain)
+       {
+               MemoryContext oldcxt = CurrentMemoryContext;
+
+               /*
+                * Store domain_check extra in CurTransactionContext. When we are
+                * in other transaction, the domain_check_extra cache is not valid.
+                */
+               if (svar->domain_check_extra_lxid != MyProc->lxid)
+                       svar->domain_check_extra = NULL;
+
+               domain_check(svar->value, svar->isnull,
+                                        svar->typid, &svar->domain_check_extra,
+                                        CurTransactionContext);
+
+               svar->domain_check_extra_lxid = MyProc->lxid;
+
+               MemoryContextSwitchTo(oldcxt);
+       }

I agree that storing the domain_check_extra in the transaction context sounds
sensible, but the memory context handling is not quite right.

Looking at domain_check, it doesn't change the current memory context, so as-is
all the code related to oldcxt is unnecessary.

removed
 

Some other callers like expandedrecord.c do switch to a short lived context to
limit the lifetime of the possible leak by the expression evaluation, but I
don't think that's an option here.

merged your patches, big thanks

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Tue, Sep 06, 2022 at 08:43:59AM +0200, Pavel Stehule wrote:
> Hi
>
> After talking with Julian I removed "debug" fields name and nsname from
> SVariable structure. When it is possible it is better to read these fields
> from catalog without risk of obsoletely or necessity to refresh these
> fields. In other cases we display only oid of variable instead name and
> nsname (It is used just for debug purposes).

Thanks!  I'm just adding back the forgotten Cc list.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Tue, Sep 06, 2022 at 06:23:12PM +0800, Julien Rouhaud wrote:
> On Tue, Sep 06, 2022 at 08:43:59AM +0200, Pavel Stehule wrote:
> > Hi
> >
> > After talking with Julian I removed "debug" fields name and nsname from
> > SVariable structure. When it is possible it is better to read these fields
> > from catalog without risk of obsoletely or necessity to refresh these
> > fields. In other cases we display only oid of variable instead name and
> > nsname (It is used just for debug purposes).
>
> Thanks!  I'm just adding back the forgotten Cc list.

About the last change:

+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
+{
[...]
+           elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) should be rechecked (forced by sinval)",
+                        get_namespace_name(get_session_variable_namespace(svar->varid)),
+                        get_session_variable_name(svar->varid),
+                        svar->varid);

There's no guarantee that the variable still exists in cache (for variables
dropped in the current transaction), or even that the callback is called while
in a transaction state, so we should only display the oid here.

FTR just to be sure I ran all the new regression tests (with this fix) with CCA
and log_min_messages = DEBUG1 and it didn't hit any problem, so it doesn't seem
that there's any other issue hidden somewhere.


Other than that I don't see any remaining problems in session_variable.c.  I
still have a few nitpicking comments though:

+static SVariable
+prepare_variable_for_reading(Oid varid)
+{
[...]
+            /* Store result before releasing Executor memory */
+            set_session_variable(svar, value, isnull, true);
+
+            MemoryContextSwitchTo(oldcxt);
+
+            FreeExecutorState(estate);

The comment and code is a bit misleading, as it's not immediately obvious that
set_session_variable() doesn't rely on the current memory contex for
allocations.  Simply moving the MemoryContextSwitchTo() before the
set_session_variable() would be better.

+typedef struct SVariableData
+{
[...]
+    bool        is_domain;
+    Oid            basetypeid;
+    void       *domain_check_extra;
+    LocalTransactionId domain_check_extra_lxid;

AFAICS basetypeid isn't needed anymore.


+ /* Both lists hold fields of SVariableXActActionItem type */
+ static List *xact_on_commit_drop_actions = NIL;
+ static List *xact_on_commit_reset_actions = NIL;

Is it possible to merge both in a single list?  I don't think that there's much
to gain trying to separate those.  They shouldn't contain a lot of entries, and
they're usually scanned at the same time anyway.

This is especially important as one of the tricky parts of this patchset is
maintaining those lists across subtransactions, and since both have the same
heuristics all the related code is duplicated.

I see that in AtPreEOXact_SessionVariable_on_xact_actions() both lists are
handled interleaved with the xact_recheck_varids, but I don't see any reason
why we couldn't process both action lists first and then process the rechecks.
I did a quick test and don't see any failure in the regression tests.


+void
+RemoveSessionVariable(Oid varid)
+{

I looks like a layering violation to have (part of) the code for CREATE
VARIABLE in pg_variable.[ch] and the code for DROP VARIABLE in
session_variable.[ch].

I think it was done mostly because it was the initial sync_sessionvars_all()
that was responsible to avoid cleaning up memory for variables dropped in the
current transaction, but that's not a requirement anymore.  So I don't see
anything preventing us from moving RemoveSessionVariable() in pg_variable, and
export some function in session_variable to do the additional work for properly
maintaining the hash table if needed (with that knowledge held in
session_variable, not in pg_variable).  You should only need to pass the oid of
the variable and the eoxaction.

Simlarly, why not move DefineSessionVariable() in pg_variable and expose some
API in session_variable to register the needed SVAR_ON_COMMIT_DROP action?

Also, while not a problem I don't think that the CommandCounterIncrement() is
necessary in DefineSessionVariable().  CREATE VARIABLE is a single operation
and you can't have anything else running in the same ProcessUtility() call.
And since cd3e27464cc you have the guarantee that a CommandCounterIncrement()
will happen at the end of the utility command processing.

While at it, maybe it would be good to add some extra tests in
src/test/modules/test_extensions.  I'm thinking a version 1.0 that creates a
variable and initialize the value (and and extra step after creating the
extension to make sure that the value is really set), and an upgrade to 2.0
that creates a temp variable on commit drop, that has to fail due to the
dependecy on the extension.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 8. 9. 2022 v 9:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Sep 06, 2022 at 06:23:12PM +0800, Julien Rouhaud wrote:
> On Tue, Sep 06, 2022 at 08:43:59AM +0200, Pavel Stehule wrote:
> > Hi
> >
> > After talking with Julian I removed "debug" fields name and nsname from
> > SVariable structure. When it is possible it is better to read these fields
> > from catalog without risk of obsoletely or necessity to refresh these
> > fields. In other cases we display only oid of variable instead name and
> > nsname (It is used just for debug purposes).
>
> Thanks!  I'm just adding back the forgotten Cc list.

About the last change:

+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
+{
[...]
+           elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) should be rechecked (forced by sinval)",
+                        get_namespace_name(get_session_variable_namespace(svar->varid)),
+                        get_session_variable_name(svar->varid),
+                        svar->varid);


fixed
 
There's no guarantee that the variable still exists in cache (for variables
dropped in the current transaction), or even that the callback is called while
in a transaction state, so we should only display the oid here.

FTR just to be sure I ran all the new regression tests (with this fix) with CCA
and log_min_messages = DEBUG1 and it didn't hit any problem, so it doesn't seem
that there's any other issue hidden somewhere.


Other than that I don't see any remaining problems in session_variable.c.  I
still have a few nitpicking comments though:

+static SVariable
+prepare_variable_for_reading(Oid varid)
+{
[...]
+                       /* Store result before releasing Executor memory */
+                       set_session_variable(svar, value, isnull, true);
+
+                       MemoryContextSwitchTo(oldcxt);
+
+                       FreeExecutorState(estate);

The comment and code is a bit misleading, as it's not immediately obvious that
set_session_variable() doesn't rely on the current memory contex for
allocations.  Simply moving the MemoryContextSwitchTo() before the
set_session_variable() would be better.

changed
 

+typedef struct SVariableData
+{
[...]
+       bool            is_domain;
+       Oid                     basetypeid;
+       void       *domain_check_extra;
+       LocalTransactionId domain_check_extra_lxid;

AFAICS basetypeid isn't needed anymore.


removed
 

+ /* Both lists hold fields of SVariableXActActionItem type */
+ static List *xact_on_commit_drop_actions = NIL;
+ static List *xact_on_commit_reset_actions = NIL;

Is it possible to merge both in a single list?  I don't think that there's much
to gain trying to separate those.  They shouldn't contain a lot of entries, and
they're usually scanned at the same time anyway.

This is especially important as one of the tricky parts of this patchset is
maintaining those lists across subtransactions, and since both have the same
heuristics all the related code is duplicated.

I see that in AtPreEOXact_SessionVariable_on_xact_actions() both lists are
handled interleaved with the xact_recheck_varids, but I don't see any reason
why we couldn't process both action lists first and then process the rechecks.
I did a quick test and don't see any failure in the regression tests.

Originally it was not possible, because there was no xact_reset_varids list, and without this list the processing
ON_COMMIT_DROP started DROP VARIABLE command, and there was a request for ON_COMMIT_RESET action.
Now, it is possible, because in RemoveSessionVariable is conditional execution:

<--><--><-->if (!svar->eox_reset)
<--><--><--><-->register_session_variable_xact_action(varid,
<--><--><--><--><--><--><--><--><--><--><--><--><-->  SVAR_ON_COMMIT_RESET);
<--><-->}

So when we process ON_COMMIT_DROP actions, we know that the reset will not be processed by ON_COMMIT_RESET action,
and then these lists can be merged.

so I merged these two lists to one

 


+void
+RemoveSessionVariable(Oid varid)
+{

I looks like a layering violation to have (part of) the code for CREATE
VARIABLE in pg_variable.[ch] and the code for DROP VARIABLE in
session_variable.[ch].

I think it was done mostly because it was the initial sync_sessionvars_all()
that was responsible to avoid cleaning up memory for variables dropped in the
current transaction, but that's not a requirement anymore.  So I don't see
anything preventing us from moving RemoveSessionVariable() in pg_variable, and
export some function in session_variable to do the additional work for properly
maintaining the hash table if needed (with that knowledge held in
session_variable, not in pg_variable).  You should only need to pass the oid of
the variable and the eoxaction.

I am not sure if the proposed change helps. With it I need to break encapsulation. Now, all implementation details are hidden in session_variable.c.

I understand that the operation Define and Remove are different from operations Set and Get, but all are commands, and all need access to sessionvars and some lists.
 

Simlarly, why not move DefineSessionVariable() in pg_variable and expose some
API in session_variable to register the needed SVAR_ON_COMMIT_DROP action?

Also, while not a problem I don't think that the CommandCounterIncrement() is
necessary in DefineSessionVariable().  CREATE VARIABLE is a single operation
and you can't have anything else running in the same ProcessUtility() call.
And since cd3e27464cc you have the guarantee that a CommandCounterIncrement()
will happen at the end of the utility command processing.

removed
 

While at it, maybe it would be good to add some extra tests in
src/test/modules/test_extensions.  I'm thinking a version 1.0 that creates a
variable and initialize the value (and and extra step after creating the
extension to make sure that the value is really set), and an upgrade to 2.0
that creates a temp variable on commit drop, that has to fail due to the
dependecy on the extension.

In updated patches I replaced used cacheMemoryContext by TopTransactionContext what is more correct (I think)

Regards

Pavel

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Sun, Sep 11, 2022 at 09:29:49PM +0200, Pavel Stehule wrote:
>>
>> Originally it was not possible, because there was no xact_reset_varids list, and without this list the processing
>> ON_COMMIT_DROP started DROP VARIABLE command, and there was a request for ON_COMMIT_RESET action.
>> Now, it is possible, because in RemoveSessionVariable is conditional execution:
>> 
>> <--><--><-->if (!svar->eox_reset)
>> <--><--><--><-->register_session_variable_xact_action(varid,
>> <--><--><--><--><--><--><--><--><--><--><--><--><-->  SVAR_ON_COMMIT_RESET);
>> <--><-->}
>> 
>> So when we process ON_COMMIT_DROP actions, we know that the reset will not be processed by ON_COMMIT_RESET action,
>> and then these lists can be merged.
>> 
>> so I merged these two lists to one

Thanks!  This really helps with code readability, and after looking at it I
found some issues (see below).
>
> changes:
>
> - some minor cleaning
> - refactoring of RemoveSessionVariable  - move part of code to pg_variable.c

Thanks.  I think we could still do more to split what code belongs to
pg_variable.c and session_variable.c.  In my opinion, the various DDL code
should only invoke functions in pg_variable.c, which themselves can call
function in session_variable.c if needed, and session_variable shouldn't know
about CreateSessionVarStmt (which should probably be rename
CreateVariableStmt?) or VariableRelationId.  After an off-list bikeshedding
session with Pavel, we came up with SessionVariableCreatePostprocess() and
SessionVariableDropPostprocess() for the functions in session_variable.c called
by pg_variable.c when handling CREATE VARIABLE and DROP VARIABLE commands.

I'm attaching a new patchset with this change and some more (see below).  I'm
not sending .txt files as this is rebased on top on the recent GUC refactoring
patch.  It won't change the cfbot outcome though, as I also add new regression
tests that are for now failing (see below).  I tried to keep the changes in
extra "FIXUP" patches when possible, but the API changes in the first patch
cause conflicts in the next one, so the big session variable patch has to
contain the needed changes.

In this patchset, I also changed the following:

- global pass on the comments in session_variable
- removed now useless sessionvars_types
- added missing prototypes for static functions (for consistency), and moved
  all the static functions before the static function
- minor other nitpicking / stylistic changes

Here are the problems I found:

- IdentifyVariable()

        /*
         * Lock relation.  This will also accept any pending invalidation
         * messages.  If we got back InvalidOid, indicating not found, then
         * there's nothing to lock, but we accept invalidation messages
         * anyway, to flush any negative catcache entries that may be
         * lingering.
         */
+        if (!OidIsValid(varid))
+            AcceptInvalidationMessages();
+        else if (OidIsValid(varid))
+            LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
+
+        if (inval_count == SharedInvalidMessageCounter)
+            break;
+
+        retry = true;
+        old_varid = varid;
+    }

AFAICS it's correct, but just to be extra cautious I'd explicitly set varid to
InvalidOid before looping, so you restart in the same condition as the first
iteration (since varid is initialize when declared).  Also, the comments should
be modified, it's "Lock variable", not "Lock relation", same for the comment in
the previous chunk ("we've locked the relation  that used to have this
name...").

+Datum
+pg_debug_show_used_session_variables(PG_FUNCTION_ARGS)
+{
+[...]
+            else
+            {
+                /*
+                 * When session variable was removed from catalog, but still
+                 * it in memory. The memory was not purged yet.
+                 */
+                nulls[1] = true;
+                nulls[2] = true;
+                nulls[4] = true;
+                values[5] = BoolGetDatum(true);
+                nulls[6] = true;
+                nulls[7] = true;
+                nulls[8] = true;
+            }

I'm wondering if we could try to improve things a bit here.  Maybe display the
variable oid instead of its name as we still have that information, the type
(using FORMAT_TYPE_ALLOW_INVALID as there's no guarantee that the type would
still exist without the dependency) and whether the variable is valid (at least
per its stored value).  We can keep NULL for the privileges, as there's no API
avoid erroring if the role has been dropped.

+{ oid => '8488', descr => 'debug list of used session variables',
+  proname => 'pg_debug_show_used_session_variables', prorows => '1000', proretset => 't',
+  provolatile => 's', prorettype => 'record', proargtypes => '',
+  proallargtypes => '{oid,text,text,oid,text,bool,bool,bool,bool}',
+  proargmodes => '{o,o,o,o,o,o,o,o,o}',
+  proargnames => '{varid,schema,name,typid,typname,removed,has_value,can_read,can_write}',

Since we change READ / WRITE acl for SELECT / UPDATE, we should rename the
column can_select and can_update.

+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
+{
+ [...]
+    while ((svar = (SVariable) hash_seq_search(&status)) != NULL)
+    {
+        if (hashvalue == 0 || svar->hashvalue == hashvalue)
+        {
+ [...]
+            xact_recheck_varids = list_append_unique_oid(xact_recheck_varids,
+                                                         svar->varid);

This has a pretty terrible complexity.  It can degenerate badly, and there
isn't any CHECK_FOR_INTERRUPTS so you could easily lock a backend for quite
some time.

I think we should just keep appending oids, and do a list_sort(list,
list_oid_cmp) and list_deduplicate_oid(list) before processing the list, in
sync_sessionvars_all() and AtPreEOXact_SessionVariable_on_xact_actions().

Maybe while at it we could reuse sync_sessionvars_all in
AtPreEOXact_SessionVariable_on_xact_actions (with a way to ask
for the lxid check or not), rather than duplicating the whole logic twice?

+/*
+ * Fast drop of the complete content of all session variables hash table.
+ * This is code for DISCARD VARIABLES command. This command
+ * cannot be run inside transaction, so we don't need to handle
+ * end of transaction actions.
+ */
+void
+ResetSessionVariables(void)
+{
+    /* Destroy hash table and reset related memory context */
+    if (sessionvars)
+    {
+        hash_destroy(sessionvars);
+        sessionvars = NULL;
+
+        hash_destroy(sessionvars_types);
+        sessionvars_types = NULL;
+    }
+
+    /* Release memory allocated by session variables */
+    if (SVariableMemoryContext != NULL)
+        MemoryContextReset(SVariableMemoryContext);
+
+    /*
+     * There are not any session variables left, so simply trim xact
+     * action list, and other lists.
+     */
+    list_free_deep(xact_on_commit_actions);
+    xact_on_commit_actions = NIL;
+
+    /* We should clean xact_reset_varids */
+    list_free(xact_reset_varids);
+    xact_reset_varids = NIL;
+
+    /* we should clean xact_recheck_varids */
+    list_free(xact_recheck_varids);
+    xact_recheck_varids = NIL;
+}

The initial comment is wrong.  This function is used for both DISCARD VARIABLES
and DISCARD ALL, but only DISCARD ALL isn't allowed in a transaction (I fixed
the comment in the attached patchset).
We should allow DISCARD VARIABLES in a transaction, therefore it needs some
more thinking on which list can be freed, and in which context it should hold
its data.  AFAICS the only problematic case is ON COMMIT DROP, but an extra
check wouldn't hurt.  For instance:

rjuju=# BEGIN;
BEGIN

rjuju=# CREATE TEMP VARIABLE v AS int ON COMMIT DROP;
CREATE VARIABLE

rjuju=# DISCARD VARIABLES ;
DISCARD VARIABLES

rjuju=# COMMIT;
COMMIT

rjuju=# \dV
                                            List of variables
  Schema   | Name |  Type   | Collation | Nullable | Mutable | Default | Owner | Transactional end action
-----------+------+---------+-----------+----------+---------+---------+-------+--------------------------
 pg_temp_3 | v    | integer |           | t        | t       |         | rjuju | ON COMMIT DROP
(1 row)

Note that I still think that keeping a single List for both SVariableXActAction
helps for readability, even if it means cherry-picking which items should be
removed on DISCARD VARIABLES (which shouldn't be a very frequent operation
anyway).

Also, xact_recheck_varids is allocated in SVariableMemoryContext, so DISCARD
VARIABLE will crash if there's any pending recheck action.

There's only one regression test for DISCARD VARIABLE, which clearly wasn't
enough.  There should be one for the ON COMMIT DROP (which can be added in
normal regression test), one one with all action list populated (that need to
be in isolation tester).  Both are added in the patchset in a suggestion patch,
and for now the first test fails and the second crashes.


- set_session_variable() is documented to either succeed or not change the
  currently set value.  While it's globally true, I see 2 things that could be
  problematic:

  - free_session_variable_value() could technically fail.  However, I don't see
    how it could be happening unless there's a memory corruption, so this would
    result in either an abort, or a backend in a very bad state.  Anyway, since
    pfree() can clearly ereport(ERROR) we should probably do something about
    it.  That being said, I don't really see the point of trying to preserve a
    value that looks like random pointer, which will probably cause a segfault
    the next time it's used.  Maybe add a PG_TRY block around the call and mark
    it as invalid (and set freeval to false) if that happens?

  - the final elog(DEBUG1) can also fail.  It also seems highly unlikely, so
    maybe accept that this exception is ok?  For now I'm adding such a comment
    in a suggestion patch.

- prepare_variable_for_reading() and SetSessionVariable():

+    /* Ensure so all entries in sessionvars hash table are valid */
+    sync_sessionvars_all();
+
+    /* Protect used session variable against drop until transaction end */
+    LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);

It's possible that a session variable is dropped after calling
sync_sessionvars_all(), and we would receive the sinval when acquiring the lock
on VariableRelationId but not process it until the next sync_sessionvars_all
call.  I think we should acquire the lock first and then call
sync_sessionvars_all.  I did that in the suggestion patch.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Fri, Sep 16, 2022 at 11:59:04AM +0800, Julien Rouhaud wrote:
> Hi,
> 
> On Sun, Sep 11, 2022 at 09:29:49PM +0200, Pavel Stehule wrote:
> >>
> >> Originally it was not possible, because there was no xact_reset_varids list, and without this list the processing
> >> ON_COMMIT_DROP started DROP VARIABLE command, and there was a request for ON_COMMIT_RESET action.
> >> Now, it is possible, because in RemoveSessionVariable is conditional execution:
> >> 
> >> <--><--><-->if (!svar->eox_reset)
> >> <--><--><--><-->register_session_variable_xact_action(varid,
> >> <--><--><--><--><--><--><--><--><--><--><--><--><-->  SVAR_ON_COMMIT_RESET);
> >> <--><-->}
> >> 
> >> So when we process ON_COMMIT_DROP actions, we know that the reset will not be processed by ON_COMMIT_RESET
action,
> >> and then these lists can be merged.
> >> 
> >> so I merged these two lists to one
> 
> Thanks!  This really helps with code readability, and after looking at it I
> found some issues (see below).
> >
> > changes:
> >
> > - some minor cleaning
> > - refactoring of RemoveSessionVariable  - move part of code to pg_variable.c
> 
> Thanks.  I think we could still do more to split what code belongs to
> pg_variable.c and session_variable.c.  In my opinion, the various DDL code
> should only invoke functions in pg_variable.c, which themselves can call
> function in session_variable.c if needed, and session_variable shouldn't know
> about CreateSessionVarStmt (which should probably be rename
> CreateVariableStmt?) or VariableRelationId.  After an off-list bikeshedding
> session with Pavel, we came up with SessionVariableCreatePostprocess() and
> SessionVariableDropPostprocess() for the functions in session_variable.c called
> by pg_variable.c when handling CREATE VARIABLE and DROP VARIABLE commands.
> 
> I'm attaching a new patchset with this change and some more (see below).  I'm
> not sending .txt files as this is rebased on top on the recent GUC refactoring
> patch.  It won't change the cfbot outcome though, as I also add new regression
> tests that are for now failing (see below).  I tried to keep the changes in
> extra "FIXUP" patches when possible, but the API changes in the first patch
> cause conflicts in the next one, so the big session variable patch has to
> contain the needed changes.
> 
> In this patchset, I also changed the following:
> 
> - global pass on the comments in session_variable
> - removed now useless sessionvars_types
> - added missing prototypes for static functions (for consistency), and moved
>   all the static functions before the static function
> - minor other nitpicking / stylistic changes
> 
> Here are the problems I found:
> 
> - IdentifyVariable()
> 
>         /*
>          * Lock relation.  This will also accept any pending invalidation
>          * messages.  If we got back InvalidOid, indicating not found, then
>          * there's nothing to lock, but we accept invalidation messages
>          * anyway, to flush any negative catcache entries that may be
>          * lingering.
>          */
> +        if (!OidIsValid(varid))
> +            AcceptInvalidationMessages();
> +        else if (OidIsValid(varid))
> +            LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
> +
> +        if (inval_count == SharedInvalidMessageCounter)
> +            break;
> +
> +        retry = true;
> +        old_varid = varid;
> +    }
> 
> AFAICS it's correct, but just to be extra cautious I'd explicitly set varid to
> InvalidOid before looping, so you restart in the same condition as the first
> iteration (since varid is initialize when declared).  Also, the comments should
> be modified, it's "Lock variable", not "Lock relation", same for the comment in
> the previous chunk ("we've locked the relation  that used to have this
> name...").
> 
> +Datum
> +pg_debug_show_used_session_variables(PG_FUNCTION_ARGS)
> +{
> +[...]
> +            else
> +            {
> +                /*
> +                 * When session variable was removed from catalog, but still
> +                 * it in memory. The memory was not purged yet.
> +                 */
> +                nulls[1] = true;
> +                nulls[2] = true;
> +                nulls[4] = true;
> +                values[5] = BoolGetDatum(true);
> +                nulls[6] = true;
> +                nulls[7] = true;
> +                nulls[8] = true;
> +            }
> 
> I'm wondering if we could try to improve things a bit here.  Maybe display the
> variable oid instead of its name as we still have that information, the type
> (using FORMAT_TYPE_ALLOW_INVALID as there's no guarantee that the type would
> still exist without the dependency) and whether the variable is valid (at least
> per its stored value).  We can keep NULL for the privileges, as there's no API
> avoid erroring if the role has been dropped.
> 
> +{ oid => '8488', descr => 'debug list of used session variables',
> +  proname => 'pg_debug_show_used_session_variables', prorows => '1000', proretset => 't',
> +  provolatile => 's', prorettype => 'record', proargtypes => '',
> +  proallargtypes => '{oid,text,text,oid,text,bool,bool,bool,bool}',
> +  proargmodes => '{o,o,o,o,o,o,o,o,o}',
> +  proargnames => '{varid,schema,name,typid,typname,removed,has_value,can_read,can_write}',
> 
> Since we change READ / WRITE acl for SELECT / UPDATE, we should rename the
> column can_select and can_update.
> 
> +static void
> +pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
> +{
> + [...]
> +    while ((svar = (SVariable) hash_seq_search(&status)) != NULL)
> +    {
> +        if (hashvalue == 0 || svar->hashvalue == hashvalue)
> +        {
> + [...]
> +            xact_recheck_varids = list_append_unique_oid(xact_recheck_varids,
> +                                                         svar->varid);
> 
> This has a pretty terrible complexity.  It can degenerate badly, and there
> isn't any CHECK_FOR_INTERRUPTS so you could easily lock a backend for quite
> some time.
> 
> I think we should just keep appending oids, and do a list_sort(list,
> list_oid_cmp) and list_deduplicate_oid(list) before processing the list, in
> sync_sessionvars_all() and AtPreEOXact_SessionVariable_on_xact_actions().
> 
> Maybe while at it we could reuse sync_sessionvars_all in
> AtPreEOXact_SessionVariable_on_xact_actions (with a way to ask
> for the lxid check or not), rather than duplicating the whole logic twice?
> 
> +/*
> + * Fast drop of the complete content of all session variables hash table.
> + * This is code for DISCARD VARIABLES command. This command
> + * cannot be run inside transaction, so we don't need to handle
> + * end of transaction actions.
> + */
> +void
> +ResetSessionVariables(void)
> +{
> +    /* Destroy hash table and reset related memory context */
> +    if (sessionvars)
> +    {
> +        hash_destroy(sessionvars);
> +        sessionvars = NULL;
> +
> +        hash_destroy(sessionvars_types);
> +        sessionvars_types = NULL;
> +    }
> +
> +    /* Release memory allocated by session variables */
> +    if (SVariableMemoryContext != NULL)
> +        MemoryContextReset(SVariableMemoryContext);
> +
> +    /*
> +     * There are not any session variables left, so simply trim xact
> +     * action list, and other lists.
> +     */
> +    list_free_deep(xact_on_commit_actions);
> +    xact_on_commit_actions = NIL;
> +
> +    /* We should clean xact_reset_varids */
> +    list_free(xact_reset_varids);
> +    xact_reset_varids = NIL;
> +
> +    /* we should clean xact_recheck_varids */
> +    list_free(xact_recheck_varids);
> +    xact_recheck_varids = NIL;
> +}
> 
> The initial comment is wrong.  This function is used for both DISCARD VARIABLES
> and DISCARD ALL, but only DISCARD ALL isn't allowed in a transaction (I fixed
> the comment in the attached patchset).
> We should allow DISCARD VARIABLES in a transaction, therefore it needs some
> more thinking on which list can be freed, and in which context it should hold
> its data.  AFAICS the only problematic case is ON COMMIT DROP, but an extra
> check wouldn't hurt.  For instance:
> 
> rjuju=# BEGIN;
> BEGIN
> 
> rjuju=# CREATE TEMP VARIABLE v AS int ON COMMIT DROP;
> CREATE VARIABLE
> 
> rjuju=# DISCARD VARIABLES ;
> DISCARD VARIABLES
> 
> rjuju=# COMMIT;
> COMMIT
> 
> rjuju=# \dV
>                                             List of variables
>   Schema   | Name |  Type   | Collation | Nullable | Mutable | Default | Owner | Transactional end action
> -----------+------+---------+-----------+----------+---------+---------+-------+--------------------------
>  pg_temp_3 | v    | integer |           | t        | t       |         | rjuju | ON COMMIT DROP
> (1 row)
> 
> Note that I still think that keeping a single List for both SVariableXActAction
> helps for readability, even if it means cherry-picking which items should be
> removed on DISCARD VARIABLES (which shouldn't be a very frequent operation
> anyway).
> 
> Also, xact_recheck_varids is allocated in SVariableMemoryContext, so DISCARD
> VARIABLE will crash if there's any pending recheck action.
> 
> There's only one regression test for DISCARD VARIABLE, which clearly wasn't
> enough.  There should be one for the ON COMMIT DROP (which can be added in
> normal regression test), one one with all action list populated (that need to
> be in isolation tester).  Both are added in the patchset in a suggestion patch,
> and for now the first test fails and the second crashes.
> 
> 
> - set_session_variable() is documented to either succeed or not change the
>   currently set value.  While it's globally true, I see 2 things that could be
>   problematic:
> 
>   - free_session_variable_value() could technically fail.  However, I don't see
>     how it could be happening unless there's a memory corruption, so this would
>     result in either an abort, or a backend in a very bad state.  Anyway, since
>     pfree() can clearly ereport(ERROR) we should probably do something about
>     it.  That being said, I don't really see the point of trying to preserve a
>     value that looks like random pointer, which will probably cause a segfault
>     the next time it's used.  Maybe add a PG_TRY block around the call and mark
>     it as invalid (and set freeval to false) if that happens?
> 
>   - the final elog(DEBUG1) can also fail.  It also seems highly unlikely, so
>     maybe accept that this exception is ok?  For now I'm adding such a comment
>     in a suggestion patch.
> 
> - prepare_variable_for_reading() and SetSessionVariable():
> 
> +    /* Ensure so all entries in sessionvars hash table are valid */
> +    sync_sessionvars_all();
> +
> +    /* Protect used session variable against drop until transaction end */
> +    LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
> 
> It's possible that a session variable is dropped after calling
> sync_sessionvars_all(), and we would receive the sinval when acquiring the lock
> on VariableRelationId but not process it until the next sync_sessionvars_all
> call.  I think we should acquire the lock first and then call
> sync_sessionvars_all.  I did that in the suggestion patch.

Rebased patcshet against recent conflicts, thanks to Pavel for the reminder.

While sending a new patch, I realized that I forgot mentionning this in
execMain.c:

@@ -200,6 +201,61 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
     Assert(queryDesc->sourceText != NULL);
     estate->es_sourceText = queryDesc->sourceText;

+    /*
+     * The executor doesn't work with session variables directly. Values of
+     * related session variables are copied to dedicated array, and this array
+     * is passed to executor.
+     */
+    if (queryDesc->num_session_variables > 0)
+    {
+        /*
+         * When paralel access to query parameters (including related session
+         * variables) is required, then related session variables are restored
+         * (deserilized) in queryDesc already. So just push pointer of this
+         * array to executor's estate.
+         */
+        estate->es_session_variables = queryDesc->session_variables;
+        estate->es_num_session_variables = queryDesc->num_session_variables;
+    }
+    else if (queryDesc->plannedstmt->sessionVariables)
+    {
+        ListCell   *lc;
+        int            nSessionVariables;
+        int            i = 0;
+
+        /*
+         * In this case, the query uses session variables, but we have to
+         * prepare the array with passed values (of used session variables)
+         * first.
+         */
+        nSessionVariables = list_length(queryDesc->plannedstmt->sessionVariables);
+
+        /* Create the array used for passing values of used session variables */
+        estate->es_session_variables = (SessionVariableValue *)
+            palloc(nSessionVariables * sizeof(SessionVariableValue));
+
+        /* Fill the array */
+        [...]
+
+        estate->es_num_session_variables = nSessionVariables;
+    }

I haven't looked at that part yet, but the comments are a bit obscure.  IIUC
the first branch is for parallel workers only, if the main backend provided the
array, and the 2nd chunk is for the main backend.  If so, it could be made
clearer, and maybe add an assert about IsParallelWorker() (or
!IsParallelWorker()) as needed?

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

The patch has rotten again, sending an updated version.  Also, after
talking with Pavel, he can't work on this patch before a few days so
I'm adding some extra fixup patches for the things I reported in the
last few emails, so that the cfbot can hopefully turn green.

On Thu, Sep 22, 2022 at 2:41 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Fri, Sep 16, 2022 at 11:59:04AM +0800, Julien Rouhaud wrote:
> > Hi,
> >
> > On Sun, Sep 11, 2022 at 09:29:49PM +0200, Pavel Stehule wrote:
> > >>
> > >> Originally it was not possible, because there was no xact_reset_varids list, and without this list the
processing
> > >> ON_COMMIT_DROP started DROP VARIABLE command, and there was a request for ON_COMMIT_RESET action.
> > >> Now, it is possible, because in RemoveSessionVariable is conditional execution:
> > >>
> > >> <--><--><-->if (!svar->eox_reset)
> > >> <--><--><--><-->register_session_variable_xact_action(varid,
> > >> <--><--><--><--><--><--><--><--><--><--><--><--><-->  SVAR_ON_COMMIT_RESET);
> > >> <--><-->}
> > >>
> > >> So when we process ON_COMMIT_DROP actions, we know that the reset will not be processed by ON_COMMIT_RESET
action,
> > >> and then these lists can be merged.
> > >>
> > >> so I merged these two lists to one
> >
> > Thanks!  This really helps with code readability, and after looking at it I
> > found some issues (see below).
> > >
> > > changes:
> > >
> > > - some minor cleaning
> > > - refactoring of RemoveSessionVariable  - move part of code to pg_variable.c
> >
> > Thanks.  I think we could still do more to split what code belongs to
> > pg_variable.c and session_variable.c.  In my opinion, the various DDL code
> > should only invoke functions in pg_variable.c, which themselves can call
> > function in session_variable.c if needed, and session_variable shouldn't know
> > about CreateSessionVarStmt (which should probably be rename
> > CreateVariableStmt?) or VariableRelationId.  After an off-list bikeshedding
> > session with Pavel, we came up with SessionVariableCreatePostprocess() and
> > SessionVariableDropPostprocess() for the functions in session_variable.c called
> > by pg_variable.c when handling CREATE VARIABLE and DROP VARIABLE commands.
> >
> > I'm attaching a new patchset with this change and some more (see below).  I'm
> > not sending .txt files as this is rebased on top on the recent GUC refactoring
> > patch.  It won't change the cfbot outcome though, as I also add new regression
> > tests that are for now failing (see below).  I tried to keep the changes in
> > extra "FIXUP" patches when possible, but the API changes in the first patch
> > cause conflicts in the next one, so the big session variable patch has to
> > contain the needed changes.
> >
> > In this patchset, I also changed the following:
> >
> > - global pass on the comments in session_variable
> > - removed now useless sessionvars_types
> > - added missing prototypes for static functions (for consistency), and moved
> >   all the static functions before the static function
> > - minor other nitpicking / stylistic changes
> >
> > Here are the problems I found:
> >
> > - IdentifyVariable()
> >
> >               /*
> >                * Lock relation.  This will also accept any pending invalidation
> >                * messages.  If we got back InvalidOid, indicating not found, then
> >                * there's nothing to lock, but we accept invalidation messages
> >                * anyway, to flush any negative catcache entries that may be
> >                * lingering.
> >                */
> > +             if (!OidIsValid(varid))
> > +                     AcceptInvalidationMessages();
> > +             else if (OidIsValid(varid))
> > +                     LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
> > +
> > +             if (inval_count == SharedInvalidMessageCounter)
> > +                     break;
> > +
> > +             retry = true;
> > +             old_varid = varid;
> > +     }
> >
> > AFAICS it's correct, but just to be extra cautious I'd explicitly set varid to
> > InvalidOid before looping, so you restart in the same condition as the first
> > iteration (since varid is initialize when declared).  Also, the comments should
> > be modified, it's "Lock variable", not "Lock relation", same for the comment in
> > the previous chunk ("we've locked the relation  that used to have this
> > name...").
> >
> > +Datum
> > +pg_debug_show_used_session_variables(PG_FUNCTION_ARGS)
> > +{
> > +[...]
> > +                     else
> > +                     {
> > +                             /*
> > +                              * When session variable was removed from catalog, but still
> > +                              * it in memory. The memory was not purged yet.
> > +                              */
> > +                             nulls[1] = true;
> > +                             nulls[2] = true;
> > +                             nulls[4] = true;
> > +                             values[5] = BoolGetDatum(true);
> > +                             nulls[6] = true;
> > +                             nulls[7] = true;
> > +                             nulls[8] = true;
> > +                     }
> >
> > I'm wondering if we could try to improve things a bit here.  Maybe display the
> > variable oid instead of its name as we still have that information, the type
> > (using FORMAT_TYPE_ALLOW_INVALID as there's no guarantee that the type would
> > still exist without the dependency) and whether the variable is valid (at least
> > per its stored value).  We can keep NULL for the privileges, as there's no API
> > avoid erroring if the role has been dropped.
> >
> > +{ oid => '8488', descr => 'debug list of used session variables',
> > +  proname => 'pg_debug_show_used_session_variables', prorows => '1000', proretset => 't',
> > +  provolatile => 's', prorettype => 'record', proargtypes => '',
> > +  proallargtypes => '{oid,text,text,oid,text,bool,bool,bool,bool}',
> > +  proargmodes => '{o,o,o,o,o,o,o,o,o}',
> > +  proargnames => '{varid,schema,name,typid,typname,removed,has_value,can_read,can_write}',
> >
> > Since we change READ / WRITE acl for SELECT / UPDATE, we should rename the
> > column can_select and can_update.
> >
> > +static void
> > +pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
> > +{
> > + [...]
> > +     while ((svar = (SVariable) hash_seq_search(&status)) != NULL)
> > +     {
> > +             if (hashvalue == 0 || svar->hashvalue == hashvalue)
> > +             {
> > + [...]
> > +                     xact_recheck_varids = list_append_unique_oid(xact_recheck_varids,
> > +
svar->varid);
> >
> > This has a pretty terrible complexity.  It can degenerate badly, and there
> > isn't any CHECK_FOR_INTERRUPTS so you could easily lock a backend for quite
> > some time.
> >
> > I think we should just keep appending oids, and do a list_sort(list,
> > list_oid_cmp) and list_deduplicate_oid(list) before processing the list, in
> > sync_sessionvars_all() and AtPreEOXact_SessionVariable_on_xact_actions().
> >
> > Maybe while at it we could reuse sync_sessionvars_all in
> > AtPreEOXact_SessionVariable_on_xact_actions (with a way to ask
> > for the lxid check or not), rather than duplicating the whole logic twice?
> >
> > +/*
> > + * Fast drop of the complete content of all session variables hash table.
> > + * This is code for DISCARD VARIABLES command. This command
> > + * cannot be run inside transaction, so we don't need to handle
> > + * end of transaction actions.
> > + */
> > +void
> > +ResetSessionVariables(void)
> > +{
> > +     /* Destroy hash table and reset related memory context */
> > +     if (sessionvars)
> > +     {
> > +             hash_destroy(sessionvars);
> > +             sessionvars = NULL;
> > +
> > +             hash_destroy(sessionvars_types);
> > +             sessionvars_types = NULL;
> > +     }
> > +
> > +     /* Release memory allocated by session variables */
> > +     if (SVariableMemoryContext != NULL)
> > +             MemoryContextReset(SVariableMemoryContext);
> > +
> > +     /*
> > +      * There are not any session variables left, so simply trim xact
> > +      * action list, and other lists.
> > +      */
> > +     list_free_deep(xact_on_commit_actions);
> > +     xact_on_commit_actions = NIL;
> > +
> > +     /* We should clean xact_reset_varids */
> > +     list_free(xact_reset_varids);
> > +     xact_reset_varids = NIL;
> > +
> > +     /* we should clean xact_recheck_varids */
> > +     list_free(xact_recheck_varids);
> > +     xact_recheck_varids = NIL;
> > +}
> >
> > The initial comment is wrong.  This function is used for both DISCARD VARIABLES
> > and DISCARD ALL, but only DISCARD ALL isn't allowed in a transaction (I fixed
> > the comment in the attached patchset).
> > We should allow DISCARD VARIABLES in a transaction, therefore it needs some
> > more thinking on which list can be freed, and in which context it should hold
> > its data.  AFAICS the only problematic case is ON COMMIT DROP, but an extra
> > check wouldn't hurt.  For instance:
> >
> > rjuju=# BEGIN;
> > BEGIN
> >
> > rjuju=# CREATE TEMP VARIABLE v AS int ON COMMIT DROP;
> > CREATE VARIABLE
> >
> > rjuju=# DISCARD VARIABLES ;
> > DISCARD VARIABLES
> >
> > rjuju=# COMMIT;
> > COMMIT
> >
> > rjuju=# \dV
> >                                             List of variables
> >   Schema   | Name |  Type   | Collation | Nullable | Mutable | Default | Owner | Transactional end action
> > -----------+------+---------+-----------+----------+---------+---------+-------+--------------------------
> >  pg_temp_3 | v    | integer |           | t        | t       |         | rjuju | ON COMMIT DROP
> > (1 row)
> >
> > Note that I still think that keeping a single List for both SVariableXActAction
> > helps for readability, even if it means cherry-picking which items should be
> > removed on DISCARD VARIABLES (which shouldn't be a very frequent operation
> > anyway).
> >
> > Also, xact_recheck_varids is allocated in SVariableMemoryContext, so DISCARD
> > VARIABLE will crash if there's any pending recheck action.
> >
> > There's only one regression test for DISCARD VARIABLE, which clearly wasn't
> > enough.  There should be one for the ON COMMIT DROP (which can be added in
> > normal regression test), one one with all action list populated (that need to
> > be in isolation tester).  Both are added in the patchset in a suggestion patch,
> > and for now the first test fails and the second crashes.
> >
> >
> > - set_session_variable() is documented to either succeed or not change the
> >   currently set value.  While it's globally true, I see 2 things that could be
> >   problematic:
> >
> >   - free_session_variable_value() could technically fail.  However, I don't see
> >     how it could be happening unless there's a memory corruption, so this would
> >     result in either an abort, or a backend in a very bad state.  Anyway, since
> >     pfree() can clearly ereport(ERROR) we should probably do something about
> >     it.  That being said, I don't really see the point of trying to preserve a
> >     value that looks like random pointer, which will probably cause a segfault
> >     the next time it's used.  Maybe add a PG_TRY block around the call and mark
> >     it as invalid (and set freeval to false) if that happens?
> >
> >   - the final elog(DEBUG1) can also fail.  It also seems highly unlikely, so
> >     maybe accept that this exception is ok?  For now I'm adding such a comment
> >     in a suggestion patch.
> >
> > - prepare_variable_for_reading() and SetSessionVariable():
> >
> > +     /* Ensure so all entries in sessionvars hash table are valid */
> > +     sync_sessionvars_all();
> > +
> > +     /* Protect used session variable against drop until transaction end */
> > +     LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
> >
> > It's possible that a session variable is dropped after calling
> > sync_sessionvars_all(), and we would receive the sinval when acquiring the lock
> > on VariableRelationId but not process it until the next sync_sessionvars_all
> > call.  I think we should acquire the lock first and then call
> > sync_sessionvars_all.  I did that in the suggestion patch.
>
> Rebased patcshet against recent conflicts, thanks to Pavel for the reminder.
>
> While sending a new patch, I realized that I forgot mentionning this in
> execMain.c:
>
> @@ -200,6 +201,61 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
>         Assert(queryDesc->sourceText != NULL);
>         estate->es_sourceText = queryDesc->sourceText;
>
> +       /*
> +        * The executor doesn't work with session variables directly. Values of
> +        * related session variables are copied to dedicated array, and this array
> +        * is passed to executor.
> +        */
> +       if (queryDesc->num_session_variables > 0)
> +       {
> +               /*
> +                * When paralel access to query parameters (including related session
> +                * variables) is required, then related session variables are restored
> +                * (deserilized) in queryDesc already. So just push pointer of this
> +                * array to executor's estate.
> +                */
> +               estate->es_session_variables = queryDesc->session_variables;
> +               estate->es_num_session_variables = queryDesc->num_session_variables;
> +       }
> +       else if (queryDesc->plannedstmt->sessionVariables)
> +       {
> +               ListCell   *lc;
> +               int                     nSessionVariables;
> +               int                     i = 0;
> +
> +               /*
> +                * In this case, the query uses session variables, but we have to
> +                * prepare the array with passed values (of used session variables)
> +                * first.
> +                */
> +               nSessionVariables = list_length(queryDesc->plannedstmt->sessionVariables);
> +
> +               /* Create the array used for passing values of used session variables */
> +               estate->es_session_variables = (SessionVariableValue *)
> +                       palloc(nSessionVariables * sizeof(SessionVariableValue));
> +
> +               /* Fill the array */
> +               [...]
> +
> +               estate->es_num_session_variables = nSessionVariables;
> +       }
>
> I haven't looked at that part yet, but the comments are a bit obscure.  IIUC
> the first branch is for parallel workers only, if the main backend provided the
> array, and the 2nd chunk is for the main backend.  If so, it could be made
> clearer, and maybe add an assert about IsParallelWorker() (or
> !IsParallelWorker()) as needed?

Full list of changes:
  - rebased against multiple conflicts since last version
  - fixed the meson build
  - fixed the ON COMMIT DROP problem and the crash on RESET VARIABLES
  - fixed some copy/pasto in the expected isolation tests (visible now
that it works)
  - added the asserts and tried to clarify the comments for the
session variable handling in QueryDesc (I still haven't really read
that part)
  - did the mentioned modifications on
pg_debug_show_used_session_variables, and used CStringGetTextDatum
macro to simplify the code

Note that while waiting for the CI to finish I noticed that the commit
message for 0001 still mentions the READ/WRITE acl.  The commit
messages will probably need a bit of rewording too once everything
else is fixed, but this one could be changed already.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 25. 9. 2022 v 8:56 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

The patch has rotten again, sending an updated version.  Also, after
talking with Pavel, he can't work on this patch before a few days so
I'm adding some extra fixup patches for the things I reported in the
last few emails, so that the cfbot can hopefully turn green.

Note that while waiting for the CI to finish I noticed that the commit
message for 0001 still mentions the READ/WRITE acl.  The commit
messages will probably need a bit of rewording too once everything
else is fixed, but this one could be changed already.

I fixed the commit message of 0001 patch. Fixed shadowed variables too.

There is a partially open issue, where I and Julien are not sure about a solution, and we would like to ask for the  community's opinion. I'll send this query in separate mail.

Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

st 12. 10. 2022 v 15:26 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

ne 25. 9. 2022 v 8:56 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

The patch has rotten again, sending an updated version.  Also, after
talking with Pavel, he can't work on this patch before a few days so
I'm adding some extra fixup patches for the things I reported in the
last few emails, so that the cfbot can hopefully turn green.

Note that while waiting for the CI to finish I noticed that the commit
message for 0001 still mentions the READ/WRITE acl.  The commit
messages will probably need a bit of rewording too once everything
else is fixed, but this one could be changed already.

I fixed the commit message of 0001 patch. Fixed shadowed variables too.

There is a partially open issue, where I and Julien are not sure about a solution, and we would like to ask for the  community's opinion. I'll send this query in separate mail.

 rebased with simplified code related to usage of pfree function

Regards

Pavel


Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote:
>
> > I fixed the commit message of 0001 patch. Fixed shadowed variables too.

Thanks!

> >
> > There is a partially open issue, where I and Julien are not sure about a
> > solution, and we would like to ask for the  community's opinion. I'll send
> > this query in separate mail.
> >
>
>  rebased with simplified code related to usage of pfree function

If anyone is curious the discussion happend at [1].

I looked at the patchset, this time focusing on the LET command.  Here at the
comments I have for now:

- gram.y

@@ -11918,6 +11920,7 @@ ExplainableStmt:
             | CreateMatViewStmt
             | RefreshMatViewStmt
             | ExecuteStmt                    /* by default all are $$=$1 */
+            | LetStmt
         ;

(and other similar places) the comment should be kept to the last statement

Also, having LetStmt as an ExplainableStmt means it's allowed in a CTE:

cte_list:
        common_table_expr                        { $$ = list_make1($1); }
        | cte_list ',' common_table_expr        { $$ = lappend($1, $3); }
        ;

common_table_expr:  name opt_name_list AS opt_materialized '(' PreparableStmt ')' opt_search_clause opt_cycle_clause

And doing so hits this assert in transformWithClause:

        if (!IsA(cte->ctequery, SelectStmt))
        {
            /* must be a data-modifying statement */
            Assert(IsA(cte->ctequery, InsertStmt) ||
                   IsA(cte->ctequery, UpdateStmt) ||
                   IsA(cte->ctequery, DeleteStmt));

            pstate->p_hasModifyingCTE = true;
        }

and I'm assuming it would also fail on this in transformLetStmt:

+    /* There can't be any outer WITH to worry about */
+    Assert(pstate->p_ctenamespace == NIL);

I guess it makes sense to be able to explain a LetStmt (or using it in a
prepared statement), so it should be properly handled in transformSelectStmt.
Also, I don't see any test for a prepared LET statement, this should also be
covered.

- transformLetStmt:

+    varid = IdentifyVariable(names, &attrname, ¬_unique);

It would be nice to have a comment saying that the lock is acquired here

+    /* The grammar should have produced a SELECT */
+    if (!IsA(selectQuery, Query) ||
+        selectQuery->commandType != CMD_SELECT)
+        elog(ERROR, "unexpected non-SELECT command in LET command");

I'm wondering if this should be an Assert instead, as the grammar shouldn't
produce anything else no matter what how hard a user try.

+    /* don't allow multicolumn result */
+    if (list_length(exprList) != 1)
+        ereport(ERROR,
+                (errcode(ERRCODE_SYNTAX_ERROR),
+                 errmsg("expression is not scalar value"),
+                 parser_errposition(pstate,
+                                    exprLocation((Node *) exprList))));

This isn't covered by any regression test and it probably should.  It can be
reached with something like

LET myvar = (null::pg_class).*;

The error message could also use a bit of improvement.

I see that a_expr allows a select statement in parens, but this leads to a
sublink which already has all the required protection to gurantee a single
column, and a single row at most during execution.  This one returns for
non-scalar case:

subquery must return only one column

Maybe use something similar for it, like "expression must return only one
column"?  Similarly the error message in svariableStartupReceiver could be made
more consistent with the related errors:

+        if (++outcols > 1)
+            elog(ERROR, "svariable DestReceiver can take only one attribute");

While on svariableReceiver, I see that the current code assumes that the caller
did everything right.  That's the case right now, but it should still be made
more robust in case future code (or extensions) is added.  I'm thinking:

- svariableState.rows.  Currently not really used, should check that one and
  only one row is received in svariableReceiveSlot and
  svariableShutdownReceiver (if no row is received the variable won't be reset
  which should probably always happen once you setup an svariableReceiver)
- svariableState.typid, typmod and typlen should be double checked with the
  given varid in svariableStartupReceiver.
- svariableState.varid should be initialized with InvalidOid to avoid undefined
  behavior is caller forgets to set it.

I'm also wondering if SetVariableDestReceiverParams() should have an assert
like LockHeldByMe() for the given varid, and maybe an assert that the varid is
a session variable, to avoid running a possibly expensive execution that will
fail when receiving the slot.  I think the function would be better named
SetVariableDestReceiverVarid() or something like that.

+void
+ExecuteLetStmt(ParseState *pstate,
+               LetStmt *stmt,
+               ParamListInfo params,
+               QueryEnvironment *queryEnv,
+               QueryCompletion *qc)
+{
+ [...]
+    /* run the plan to completion */
+    ExecutorRun(queryDesc, ForwardScanDirection, 2L, true);

Why 2 rows?  I'm assuming it's an attempt to detect queries that returns more
than 1 row, but it should be documented.  Note that as mentioned above the dest
receiver currently doesn't check it, so this definitely needs to be fixed.

- IdentifyVariable:

*attrname can be set even is no variable is identified.  I guess that's ok as
it avoids useless code, but it should probably be documented in the function
header.

Also, the API doesn't look ideal.  AFAICS the only reason this function doesn't
error out in case of ambiguous name is that transformColumnRef may check if a
given name shadows a variable when session_variables_ambiguity_warning is set.
But since IdentifyVariable returns InvalidOid if the given list of identifiers
is ambiguous, it seems that the shadow detection can fail to detect a shadowed
reference if multiple variable would shadow the name:

# CREATE TYPE ab AS (a integer, b integer);
CREATE TYPE
# CREATE VARIABLE v_ab AS ab;
CREATE VARIABLE

# CREATE TABLE v_ab (a integer, b integer);
CREATE TABLE

# SET session_variables_ambiguity_warning = 1;
SET

# sELECT v_ab.a FROM v_ab;
WARNING:  42702: session variable "v_ab.a" is shadowed
LINE 1: select v_ab.a from v_ab;
               ^
DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
 a
---
(0 rows)

# CREATE SCHEMA v_ab;
CREATE SCHEMA

# CREATE VARIABLE v_ab.a AS integer;
CREATE VARIABLE

# SELECT v_ab.a FROM v_ab;
 a
---
(0 rows)


Note that a bit later in transformColumnRef(), not_unique is checked only if
the returned varid is valid, which isn't correct as InvalidOid is currently
returned if not_unique is set.

I think that the error should be raised in IdentifyVariable rather than having
every caller check it.  I'm not sure how to perfectly handle the
session_variables_ambiguity_warning though.  Maybe make not_unique optional,
and error out if not_unique is null.  If not null, set it as necessary and
return one of the oid.  The only use would be for shadowing detection, and in
that case it won't be possible to check if a warning can be avoided as it would
be if no amgibuity is found, but that's probably ok.

Or maybe instead LookupVariable should have an extra argument to only match
variable with a composite type if caller asks to.  This would avoid scenarios
like:

CREATE VARIABLE myvar AS int;
SELECT myvar.blabla;
ERROR:  42809: type integer is not composite

Is that really ok to match a variable here rather than complaining about a
missing FROM-clause?

+    indirection_start = list_length(names) - (attrname ? 1 : 0);
+    indirection = list_copy_tail(stmt->target, indirection_start);
+ [...]
+        if (indirection != NULL)
+        {
+            bool        targetIsArray;
+            char       *targetName;
+
+            targetName = get_session_variable_name(varid);
+            targetIsArray = OidIsValid(get_element_type(typid));
+
+            pstate->p_hasSessionVariables = true;
+
+            coerced_expr = (Expr *)
+                transformAssignmentIndirection(pstate,
+                                               (Node *) param,
+                                               targetName,
+                                               targetIsArray,
+                                               typid,
+                                               typmod,
+                                               InvalidOid,
+                                               indirection,
+                                               list_head(indirection),
+                                               (Node *) expr,
+                                               COERCION_PLPGSQL,
+                                               stmt->location);
+        }

I'm not sure why you use this approach rather than just having something like
"ListCell *indirection_head", set it to a non-NULL value when needed, and use
that (with names) instead.  Note that it's also not correct to compare a List
to NULL, use NIL instead.

- expr_kind_allows_session_variables

Even if that's a bit annoying, I think it's better to explicitly put all values
there rather than having a default clause.

For instance, EXPR_KIND_CYCLE_MARK is currently allowing session variables,
which doesn't look ok.  It's probably just an error from when the patchset was
rebased, but this probably wouldn't happen if you get an error for an unmatched
value if you add a new expr kind (which doesn't happen that often).

[1] https://www.postgresql.org/message-id/CAFj8pRB2+pVBFsidS-AzhHdZid40OTUspWfXS0vgahHmaWosZQ@mail.gmail.com



Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote:
>  rebased with simplified code related to usage of pfree function

Thanks for the patch, great work!

I've got a couple of questions, although I haven't fully finished reviewing yet
(so more to come):

* I'm curious about ALTER VARIABLE. Current implementation allows altering only
  the name, schema or the owner -- why not e.g. immutability?

* psql tab completion implementation mentions that CREATE VARIABLE could be
  used inside CREATE SCHEMA:

    /* CREATE VARIABLE --- is allowed inside CREATE SCHEMA, so use TailMatches */
    /* Complete CREATE VARIABLE <name> with AS */
    else if (TailMatches("IMMUTABLE"))

  Is that correct? It doesn't like it works, and from what I see it requires
  some modifications in transformCreateSchemaStmt and schema_stmt.

* psql describe mentions the following:

    /*
     * Most functions in this file are content to print an empty table when
     * there are no matching objects.  We intentionally deviate from that
     * here, but only in !quiet mode, for historical reasons.
     */

  I guess it's taken from listTables, and the extended versions says "because
  of the possibility that the user is confused about what the two pattern
  arguments mean". Are those historical reasons apply to variables as well?



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 30. 10. 2022 v 19:05 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote:
>  rebased with simplified code related to usage of pfree function

Thanks for the patch, great work!

I've got a couple of questions, although I haven't fully finished reviewing yet
(so more to come):

* I'm curious about ALTER VARIABLE. Current implementation allows altering only
  the name, schema or the owner -- why not e.g. immutability?

It is just in category - "not implemented yet". The name, schema or owner doesn't change behavior. It can be possible (in next versions) to change type, default expression, immutability (I think). But the patch is long enough so I prefer just to support basic generic ALTER related to schema, and other possibilities to implement in next iterations.
 

* psql tab completion implementation mentions that CREATE VARIABLE could be
  used inside CREATE SCHEMA:

    /* CREATE VARIABLE --- is allowed inside CREATE SCHEMA, so use TailMatches */
    /* Complete CREATE VARIABLE <name> with AS */
    else if (TailMatches("IMMUTABLE"))

  Is that correct? It doesn't like it works, and from what I see it requires
  some modifications in transformCreateSchemaStmt and schema_stmt.

yes,

This is a bug. It should be fixed

 

* psql describe mentions the following:

        /*
         * Most functions in this file are content to print an empty table when
         * there are no matching objects.  We intentionally deviate from that
         * here, but only in !quiet mode, for historical reasons.
         */

  I guess it's taken from listTables, and the extended versions says "because
  of the possibility that the user is confused about what the two pattern
  arguments mean". Are those historical reasons apply to variables as well?

The behave is same like the tables

(2022-10-30 19:48:14) postgres=# \dt
Did not find any relations.
(2022-10-30 19:48:16) postgres=# \dV
Did not find any session variables.

Thank you for comments

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

po 17. 10. 2022 v 5:17 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote:
>
> > I fixed the commit message of 0001 patch. Fixed shadowed variables too.

Thanks!

> >
> > There is a partially open issue, where I and Julien are not sure about a
> > solution, and we would like to ask for the  community's opinion. I'll send
> > this query in separate mail.
> >
>
>  rebased with simplified code related to usage of pfree function

If anyone is curious the discussion happend at [1].

I looked at the patchset, this time focusing on the LET command.  Here at the
comments I have for now:

- gram.y

@@ -11918,6 +11920,7 @@ ExplainableStmt:
                        | CreateMatViewStmt
                        | RefreshMatViewStmt
                        | ExecuteStmt                                   /* by default all are $$=$1 */
+                       | LetStmt
                ;

(and other similar places) the comment should be kept to the last statement

fixed
 

Also, having LetStmt as an ExplainableStmt means it's allowed in a CTE:

cte_list:
                common_table_expr                                               { $$ = list_make1($1); }
                | cte_list ',' common_table_expr                { $$ = lappend($1, $3); }
                ;

common_table_expr:  name opt_name_list AS opt_materialized '(' PreparableStmt ')' opt_search_clause opt_cycle_clause

And doing so hits this assert in transformWithClause:

                if (!IsA(cte->ctequery, SelectStmt))
                {
                        /* must be a data-modifying statement */
                        Assert(IsA(cte->ctequery, InsertStmt) ||
                                   IsA(cte->ctequery, UpdateStmt) ||
                                   IsA(cte->ctequery, DeleteStmt));

                        pstate->p_hasModifyingCTE = true;
                }

and I'm assuming it would also fail on this in transformLetStmt:

+       /* There can't be any outer WITH to worry about */
+       Assert(pstate->p_ctenamespace == NIL);

I guess it makes sense to be able to explain a LetStmt (or using it in a
prepared statement), so it should be properly handled in transformSelectStmt.
Also, I don't see any test for a prepared LET statement, this should also be
covered.

The LET statement doesn't return data, so it should be disallowed similar like  MERGE statement

I enhanced the regression test about PREPARE of the LET statement. I found and fix the missing plan dependency of target variable of LET command



- transformLetStmt:

+       varid = IdentifyVariable(names, &attrname, &not_unique);

It would be nice to have a comment saying that the lock is acquired here

done
 

+       /* The grammar should have produced a SELECT */
+       if (!IsA(selectQuery, Query) ||
+               selectQuery->commandType != CMD_SELECT)
+               elog(ERROR, "unexpected non-SELECT command in LET command");

I'm wondering if this should be an Assert instead, as the grammar shouldn't
produce anything else no matter what how hard a user try.

done
 

+       /* don't allow multicolumn result */
+       if (list_length(exprList) != 1)
+               ereport(ERROR,
+                               (errcode(ERRCODE_SYNTAX_ERROR),
+                                errmsg("expression is not scalar value"),
+                                parser_errposition(pstate,
+                                                                       exprLocation((Node *) exprList))));

This isn't covered by any regression test and it probably should.  It can be
reached with something like

LET myvar = (null::pg_class).*;

The error message could also use a bit of improvement.

done - the error message is like related plpgsql error message
 

I see that a_expr allows a select statement in parens, but this leads to a
sublink which already has all the required protection to gurantee a single
column, and a single row at most during execution.  This one returns for
non-scalar case:

subquery must return only one column

Maybe use something similar for it, like "expression must return only one
column"?  Similarly the error message in svariableStartupReceiver could be made
more consistent with the related errors:

+               if (++outcols > 1)
+                       elog(ERROR, "svariable DestReceiver can take only one attribute");

done
 

While on svariableReceiver, I see that the current code assumes that the caller
did everything right.  That's the case right now, but it should still be made
more robust in case future code (or extensions) is added.  I'm thinking:

- svariableState.rows.  Currently not really used, should check that one and
  only one row is received in svariableReceiveSlot and
  svariableShutdownReceiver (if no row is received the variable won't be reset
  which should probably always happen once you setup an svariableReceiver)

done
 
- svariableState.typid, typmod and typlen should be double checked with the
  given varid in svariableStartupReceiver.

done
 
- svariableState.varid should be initialized with InvalidOid to avoid undefined
  behavior is caller forgets to set it.

svariableState is initialized by palloc0
 

I'm also wondering if SetVariableDestReceiverParams() should have an assert
like LockHeldByMe() for the given varid,

done
 
and maybe an assert that the varid is
a session variable, to avoid running a possibly expensive execution that will

done
 
fail when receiving the slot.  I think the function would be better named
SetVariableDestReceiverVarid() or something like that.

done
 

+void
+ExecuteLetStmt(ParseState *pstate,
+                          LetStmt *stmt,
+                          ParamListInfo params,
+                          QueryEnvironment *queryEnv,
+                          QueryCompletion *qc)
+{
+ [...]
+       /* run the plan to completion */
+       ExecutorRun(queryDesc, ForwardScanDirection, 2L, true);

Why 2 rows?  I'm assuming it's an attempt to detect queries that returns more
than 1 row, but it should be documented.  Note that as mentioned above the dest
receiver currently doesn't check it, so this definitely needs to be fixed.

done + check + tests

 

- IdentifyVariable:

*attrname can be set even is no variable is identified.  I guess that's ok as
it avoids useless code, but it should probably be documented in the function
header.

This is a side effect. The attrname is used only when the returned oid is valid. I checked code, and
I extended comments on the function.

I am sending updated patch, next points I'll process tomorrow



 

Also, the API doesn't look ideal.  AFAICS the only reason this function doesn't
error out in case of ambiguous name is that transformColumnRef may check if a
given name shadows a variable when session_variables_ambiguity_warning is set.
But since IdentifyVariable returns InvalidOid if the given list of identifiers
is ambiguous, it seems that the shadow detection can fail to detect a shadowed
reference if multiple variable would shadow the name:

# CREATE TYPE ab AS (a integer, b integer);
CREATE TYPE
# CREATE VARIABLE v_ab AS ab;
CREATE VARIABLE

# CREATE TABLE v_ab (a integer, b integer);
CREATE TABLE

# SET session_variables_ambiguity_warning = 1;
SET

# sELECT v_ab.a FROM v_ab;
WARNING:  42702: session variable "v_ab.a" is shadowed
LINE 1: select v_ab.a from v_ab;
               ^
DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
 a
---
(0 rows)

# CREATE SCHEMA v_ab;
CREATE SCHEMA

# CREATE VARIABLE v_ab.a AS integer;
CREATE VARIABLE

# SELECT v_ab.a FROM v_ab;
 a
---
(0 rows)


Note that a bit later in transformColumnRef(), not_unique is checked only if
the returned varid is valid, which isn't correct as InvalidOid is currently
returned if not_unique is set.

I think that the error should be raised in IdentifyVariable rather than having
every caller check it.  I'm not sure how to perfectly handle the
session_variables_ambiguity_warning though.  Maybe make not_unique optional,
and error out if not_unique is null.  If not null, set it as necessary and
return one of the oid.  The only use would be for shadowing detection, and in
that case it won't be possible to check if a warning can be avoided as it would
be if no amgibuity is found, but that's probably ok.

Or maybe instead LookupVariable should have an extra argument to only match
variable with a composite type if caller asks to.  This would avoid scenarios
like:

CREATE VARIABLE myvar AS int;
SELECT myvar.blabla;
ERROR:  42809: type integer is not composite

Is that really ok to match a variable here rather than complaining about a
missing FROM-clause?

+       indirection_start = list_length(names) - (attrname ? 1 : 0);
+       indirection = list_copy_tail(stmt->target, indirection_start);
+ [...]
+               if (indirection != NULL)
+               {
+                       bool            targetIsArray;
+                       char       *targetName;
+
+                       targetName = get_session_variable_name(varid);
+                       targetIsArray = OidIsValid(get_element_type(typid));
+
+                       pstate->p_hasSessionVariables = true;
+
+                       coerced_expr = (Expr *)
+                               transformAssignmentIndirection(pstate,
+                                                                                          (Node *) param,
+                                                                                          targetName,
+                                                                                          targetIsArray,
+                                                                                          typid,
+                                                                                          typmod,
+                                                                                          InvalidOid,
+                                                                                          indirection,
+                                                                                          list_head(indirection),
+                                                                                          (Node *) expr,
+                                                                                          COERCION_PLPGSQL,
+                                                                                          stmt->location);
+               }

I'm not sure why you use this approach rather than just having something like
"ListCell *indirection_head", set it to a non-NULL value when needed, and use
that (with names) instead.  Note that it's also not correct to compare a List
to NULL, use NIL instead.

- expr_kind_allows_session_variables

Even if that's a bit annoying, I think it's better to explicitly put all values
there rather than having a default clause.

For instance, EXPR_KIND_CYCLE_MARK is currently allowing session variables,
which doesn't look ok.  It's probably just an error from when the patchset was
rebased, but this probably wouldn't happen if you get an error for an unmatched
value if you add a new expr kind (which doesn't happen that often).

+ fixed issue reported by Dmitry Dolgov

[1] https://www.postgresql.org/message-id/CAFj8pRB2+pVBFsidS-AzhHdZid40OTUspWfXS0vgahHmaWosZQ@mail.gmail.com
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi


Also, the API doesn't look ideal.  AFAICS the only reason this function doesn't
error out in case of ambiguous name is that transformColumnRef may check if a
given name shadows a variable when session_variables_ambiguity_warning is set.
But since IdentifyVariable returns InvalidOid if the given list of identifiers
is ambiguous, it seems that the shadow detection can fail to detect a shadowed
reference if multiple variable would shadow the name:

# CREATE TYPE ab AS (a integer, b integer);
CREATE TYPE
# CREATE VARIABLE v_ab AS ab;
CREATE VARIABLE

# CREATE TABLE v_ab (a integer, b integer);
CREATE TABLE

# SET session_variables_ambiguity_warning = 1;
SET

# sELECT v_ab.a FROM v_ab;
WARNING:  42702: session variable "v_ab.a" is shadowed
LINE 1: select v_ab.a from v_ab;
               ^
DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
 a
---
(0 rows)

# CREATE SCHEMA v_ab;
CREATE SCHEMA

# CREATE VARIABLE v_ab.a AS integer;
CREATE VARIABLE

# SELECT v_ab.a FROM v_ab;
 a
---
(0 rows)


Note that a bit later in transformColumnRef(), not_unique is checked only if
the returned varid is valid, which isn't correct as InvalidOid is currently
returned if not_unique is set.

I think that the error should be raised in IdentifyVariable rather than having
every caller check it.  I'm not sure how to perfectly handle the
session_variables_ambiguity_warning though.  Maybe make not_unique optional,
and error out if not_unique is null.  If not null, set it as necessary and
return one of the oid.  The only use would be for shadowing detection, and in
that case it won't be possible to check if a warning can be avoided as it would
be if no amgibuity is found, but that's probably ok.

done

I partially rewrote the IdentifyVariable routine. Now it should be robust.

 

Or maybe instead LookupVariable should have an extra argument to only match
variable with a composite type if caller asks to.  This would avoid scenarios
like:

CREATE VARIABLE myvar AS int;
SELECT myvar.blabla;
ERROR:  42809: type integer is not composite

Is that really ok to match a variable here rather than complaining about a
missing FROM-clause?

I feel "missing FROM-clause" is a little bit better, although the message "type integer is not composite" is correct too. But there is agreement so implementation of session variables should minimize impacts on PostgreSQL behaviour, and it is more comfortant with some filtering used in other places.

 

+       indirection_start = list_length(names) - (attrname ? 1 : 0);
+       indirection = list_copy_tail(stmt->target, indirection_start);
+ [...]
+               if (indirection != NULL)
+               {
+                       bool            targetIsArray;
+                       char       *targetName;
+
+                       targetName = get_session_variable_name(varid);
+                       targetIsArray = OidIsValid(get_element_type(typid));
+
+                       pstate->p_hasSessionVariables = true;
+
+                       coerced_expr = (Expr *)
+                               transformAssignmentIndirection(pstate,
+                                                                                          (Node *) param,
+                                                                                          targetName,
+                                                                                          targetIsArray,
+                                                                                          typid,
+                                                                                          typmod,
+                                                                                          InvalidOid,
+                                                                                          indirection,
+                                                                                          list_head(indirection),
+                                                                                          (Node *) expr,
+                                                                                          COERCION_PLPGSQL,
+                                                                                          stmt->location);
+               }

I'm not sure why you use this approach rather than just having something like
"ListCell *indirection_head", set it to a non-NULL value when needed, and use
that (with names) instead.  Note that it's also not correct to compare a List
to NULL, use NIL instead.

changed, fixed
 

- expr_kind_allows_session_variables

Even if that's a bit annoying, I think it's better to explicitly put all values
there rather than having a default clause.

For instance, EXPR_KIND_CYCLE_MARK is currently allowing session variables,
which doesn't look ok.  It's probably just an error from when the patchset was
rebased, but this probably wouldn't happen if you get an error for an unmatched
value if you add a new expr kind (which doesn't happen that often).

done

updated patch assigned

Regards

Pavel
 

[1] https://www.postgresql.org/message-id/CAFj8pRB2+pVBFsidS-AzhHdZid40OTUspWfXS0vgahHmaWosZQ@mail.gmail.com
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Fri, Nov 04, 2022 at 05:58:06AM +0100, Pavel Stehule wrote:
> Hi
>
> fix clang warning

I've stumbled upon something that looks weird to me (inspired by the
example from tests):

    =# create variable v2 as int;
    =# let v2 = 3;
    =# create view vv2 as select coalesce(v2, 0) + 1000 as result

    =# select * from vv2;
     result
     --------
        1003

    =# set force_parallel_mode to on;
    =# select * from vv2;
     result
     --------
        1000

In the second select the actual work is done from a worker backend.
Since values of session variables are stored in the backend local
memory, it's not being shared with the worker and the value is not found
in the hash map. Does this suppose to be like that?



Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Fri, Nov 04, 2022 at 03:07:48PM +0100, Dmitry Dolgov wrote:
> > On Fri, Nov 04, 2022 at 05:58:06AM +0100, Pavel Stehule wrote:
> > Hi
> >
> > fix clang warning
>
> I've stumbled upon something that looks weird to me (inspired by the
> example from tests):
>
>     =# create variable v2 as int;
>     =# let v2 = 3;
>     =# create view vv2 as select coalesce(v2, 0) + 1000 as result
>
>     =# select * from vv2;
>      result
>      --------
>         1003
>
>     =# set force_parallel_mode to on;
>     =# select * from vv2;
>      result
>      --------
>         1000
>
> In the second select the actual work is done from a worker backend.
> Since values of session variables are stored in the backend local
> memory, it's not being shared with the worker and the value is not found
> in the hash map. Does this suppose to be like that?

There's code to serialize and restore all used variables for parallel workers
(see code about PARAM_VARIABLE and queryDesc->num_session_variables /
queryDesc->plannedstmt->sessionVariables).  I haven't reviewed that part yet,
but it's supposed to be working.  Blind guess would be that it's missing
something in expression walker.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi


pá 4. 11. 2022 v 15:08 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Fri, Nov 04, 2022 at 05:58:06AM +0100, Pavel Stehule wrote:
> Hi
>
> fix clang warning

I've stumbled upon something that looks weird to me (inspired by the
example from tests):

    =# create variable v2 as int;
    =# let v2 = 3;
    =# create view vv2 as select coalesce(v2, 0) + 1000 as result

    =# select * from vv2;
     result
     --------
        1003

    =# set force_parallel_mode to on;
    =# select * from vv2;
     result
     --------
        1000

In the second select the actual work is done from a worker backend.
Since values of session variables are stored in the backend local
memory, it's not being shared with the worker and the value is not found
in the hash map. Does this suppose to be like that?

It looks like a bug, but parallel queries should be supported.

The value of the variable is passed as parameter to the worker backend. But probably somewhere the original reference was not replaced by parameter
 

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Fri, Nov 04, 2022 at 03:17:18PM +0100, Pavel Stehule wrote:
> > I've stumbled upon something that looks weird to me (inspired by the
> > example from tests):
> >
> >     =# create variable v2 as int;
> >     =# let v2 = 3;
> >     =# create view vv2 as select coalesce(v2, 0) + 1000 as result
> >
> >     =# select * from vv2;
> >      result
> >      --------
> >         1003
> >
> >     =# set force_parallel_mode to on;
> >     =# select * from vv2;
> >      result
> >      --------
> >         1000
> >
> > In the second select the actual work is done from a worker backend.
> > Since values of session variables are stored in the backend local
> > memory, it's not being shared with the worker and the value is not found
> > in the hash map. Does this suppose to be like that?
> >
>
> It looks like a bug, but parallel queries should be supported.
>
> The value of the variable is passed as parameter to the worker backend. But
> probably somewhere the original reference was not replaced by parameter
>
> On Fri, Nov 04, 2022 at 10:17:13PM +0800, Julien Rouhaud wrote:
> Hi,
>
> There's code to serialize and restore all used variables for parallel workers
> (see code about PARAM_VARIABLE and queryDesc->num_session_variables /
> queryDesc->plannedstmt->sessionVariables).  I haven't reviewed that part yet,
> but it's supposed to be working.  Blind guess would be that it's missing
> something in expression walker.

I see, thanks. I'll take a deeper look, my initial assumption was due to
the fact that in the worker case create_sessionvars_hashtables is
getting called for every query.



Re: Schema variables - new implementation for Postgres 15

From
Tomas Vondra
Date:
Hi,

I did a quick initial review of this patch series - attached is a
version with "review" commits for some of the parts. The current patch
seems in pretty good shape, most of what I noticed are minor issues. I
plan to do a more thorough review later.

A quick overview of the issues:

0001
----

- AtPreEOXact_SessionVariable_on_xact_actions name seems unnecessarily
complicated and redundant, and mismatching nearby functions. Why not
call it AtEOXact_SessionVariable, similar to AtEOXact_LargeObject?

- some whitespace / ordering cleanup

- I'm not sure why find_composite_type_dependencies needs the extra
"else if" branch (instead of just doing "if" as before)

- NamesFromList and IdentifyVariable seem introduced unnecessarily
early, as they are only used in 0002 and 0003 parts (in the original
patch series). Not sure if the plan is to squash everything into a
single patch, or commit individual patches.

- AFAIK patches don't need to modify typedefs.list.


0002
----

- some whitespace / ordering cleanup

- moving setting hasSessionVariables right after similar fields

- SessionVariableCreatePostprocess prototype is redundant (2x)

- I'd probably rename pg_debug_show_used_session_variables to
pg_session_variables (assuming we want to keep this view)


0003
----

- I'd rename svariableState to SVariableState, to keep the naming
consistent with other similar/related typedefs.

- some whitespace / ordering cleanup


0007
----

- minor wording change


Aside from that, I tried running this under valgrind, and that produces
this report:

==250595== Conditional jump or move depends on uninitialised value(s)
==250595==    at 0x731A48: sync_sessionvars_all (session_variable.c:513)
==250595==    by 0x7321A6: prepare_variable_for_reading
(session_variable.c:727)
==250595==    by 0x7320BA: CopySessionVariable (session_variable.c:898)
==250595==    by 0x7BC3BF: standard_ExecutorStart (execMain.c:252)
==250595==    by 0x7BC042: ExecutorStart (execMain.c:146)
==250595==    by 0xA89283: PortalStart (pquery.c:520)
==250595==    by 0xA84E8D: exec_simple_query (postgres.c:1199)
==250595==    by 0xA8425B: PostgresMain (postgres.c:4551)
==250595==    by 0x998B03: BackendRun (postmaster.c:4482)
==250595==    by 0x9980EC: BackendStartup (postmaster.c:4210)
==250595==    by 0x996F0D: ServerLoop (postmaster.c:1804)
==250595==    by 0x9948CA: PostmasterMain (postmaster.c:1476)
==250595==    by 0x8526B6: main (main.c:197)
==250595==  Uninitialised value was created by a heap allocation
==250595==    at 0xCD86F0: MemoryContextAllocExtended (mcxt.c:1138)
==250595==    by 0xC9FA1F: DynaHashAlloc (dynahash.c:292)
==250595==    by 0xC9FEC1: element_alloc (dynahash.c:1715)
==250595==    by 0xCA102A: get_hash_entry (dynahash.c:1324)
==250595==    by 0xCA0879: hash_search_with_hash_value (dynahash.c:1097)
==250595==    by 0xCA0432: hash_search (dynahash.c:958)
==250595==    by 0x731614: SetSessionVariable (session_variable.c:846)
==250595==    by 0x82FEED: svariableReceiveSlot (svariableReceiver.c:138)
==250595==    by 0x7BD277: ExecutePlan (execMain.c:1726)
==250595==    by 0x7BD0C5: standard_ExecutorRun (execMain.c:422)
==250595==    by 0x7BCE63: ExecutorRun (execMain.c:366)
==250595==    by 0x7332F0: ExecuteLetStmt (session_variable.c:1310)
==250595==    by 0xA8CC15: standard_ProcessUtility (utility.c:1076)
==250595==    by 0xA8BC72: ProcessUtility (utility.c:533)
==250595==    by 0xA8B2B9: PortalRunUtility (pquery.c:1161)
==250595==    by 0xA8A454: PortalRunMulti (pquery.c:1318)
==250595==    by 0xA89A16: PortalRun (pquery.c:794)
==250595==    by 0xA84F9E: exec_simple_query (postgres.c:1238)
==250595==    by 0xA8425B: PostgresMain (postgres.c:4551)
==250595==    by 0x998B03: BackendRun (postmaster.c:4482)
==250595==

Which I think means this:

    if (filter_lxid && svar->drop_lxid == MyProc->lxid)
        continue;

accesses drop_lxid, which was not initialized in init_session_variable.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Sat, Nov 05, 2022 at 05:04:31PM +0100, Tomas Vondra wrote:
>
> I did a quick initial review of this patch series - attached is a
> version with "review" commits for some of the parts. The current patch
> seems in pretty good shape, most of what I noticed are minor issues. I
> plan to do a more thorough review later.

Thanks!

I agree with all of your comments, just a few answers below

> - NamesFromList and IdentifyVariable seem introduced unnecessarily
> early, as they are only used in 0002 and 0003 parts (in the original
> patch series). Not sure if the plan is to squash everything into a
> single patch, or commit individual patches.

The split was mostly done to make the patch easier to review, as it adds quite
a bit of infrastructure.

There have been some previous comments to have a more logical separation and
fix similar issues, but there are still probably other oddities like that
laying around.  I personally didn't focus much on it as I don't know if the
future committer will choose to squash everything or not.

> - AFAIK patches don't need to modify typedefs.list.

I think this was discussed a year or so ago, and my understanding is that the
general rule is that it's now welcome, if not recommended, to maintain
typedefs.list in each patchset.

> Which I think means this:
>
>     if (filter_lxid && svar->drop_lxid == MyProc->lxid)
>         continue;
>
> accesses drop_lxid, which was not initialized in init_session_variable.

Agreed.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 4. 11. 2022 v 15:28 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Fri, Nov 04, 2022 at 03:17:18PM +0100, Pavel Stehule wrote:
> > I've stumbled upon something that looks weird to me (inspired by the
> > example from tests):
> >
> >     =# create variable v2 as int;
> >     =# let v2 = 3;
> >     =# create view vv2 as select coalesce(v2, 0) + 1000 as result
> >
> >     =# select * from vv2;
> >      result
> >      --------
> >         1003
> >
> >     =# set force_parallel_mode to on;
> >     =# select * from vv2;
> >      result
> >      --------
> >         1000
> >
> > In the second select the actual work is done from a worker backend.
> > Since values of session variables are stored in the backend local
> > memory, it's not being shared with the worker and the value is not found
> > in the hash map. Does this suppose to be like that?
> >
>
> It looks like a bug, but parallel queries should be supported.
>
> The value of the variable is passed as parameter to the worker backend. But
> probably somewhere the original reference was not replaced by parameter
>
> On Fri, Nov 04, 2022 at 10:17:13PM +0800, Julien Rouhaud wrote:
> Hi,
>
> There's code to serialize and restore all used variables for parallel workers
> (see code about PARAM_VARIABLE and queryDesc->num_session_variables /
> queryDesc->plannedstmt->sessionVariables).  I haven't reviewed that part yet,
> but it's supposed to be working.  Blind guess would be that it's missing
> something in expression walker.

I see, thanks. I'll take a deeper look, my initial assumption was due to
the fact that in the worker case create_sessionvars_hashtables is
getting called for every query.

It should be fixed in today's patch

The problem was in missing pushing the hasSessionVariables flag to the upper subquery in pull_up_simple_subquery.

--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -1275,6 +1275,9 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
    /* If subquery had any RLS conditions, now main query does too */
    parse->hasRowSecurity |= subquery->hasRowSecurity;
 
+   /* If subquery had session variables, now main query does too */
+   parse->hasSessionVariables |= subquery->hasSessionVariables;
+

Thank you for the check and bug report. Your example was added to regress tests

Regards

Pavel


Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 5. 11. 2022 v 17:04 odesílatel Tomas Vondra <tomas.vondra@enterprisedb.com> napsal:
Hi,

I did a quick initial review of this patch series - attached is a
version with "review" commits for some of the parts. The current patch
seems in pretty good shape, most of what I noticed are minor issues. I
plan to do a more thorough review later.

A quick overview of the issues:

0001
----

- AtPreEOXact_SessionVariable_on_xact_actions name seems unnecessarily
complicated and redundant, and mismatching nearby functions. Why not
call it AtEOXact_SessionVariable, similar to AtEOXact_LargeObject?

renamed
 

- some whitespace / ordering cleanup

- I'm not sure why find_composite_type_dependencies needs the extra
"else if" branch (instead of just doing "if" as before)

yes, it was not necessary


- NamesFromList and IdentifyVariable seem introduced unnecessarily
early, as they are only used in 0002 and 0003 parts (in the original
patch series). Not sure if the plan is to squash everything into a
single patch, or commit individual patches.

moved to 0002
 

- AFAIK patches don't need to modify typedefs.list.


0002
----

- some whitespace / ordering cleanup

- moving setting hasSessionVariables right after similar fields

fixed
 

- SessionVariableCreatePostprocess prototype is redundant (2x)

removed
 

- I'd probably rename pg_debug_show_used_session_variables to
pg_session_variables (assuming we want to keep this view)

renamed
 


0003
----

- I'd rename svariableState to SVariableState, to keep the naming
consistent with other similar/related typedefs.

renamed


- some whitespace / ordering cleanup


0007
----

- minor wording change

fixed


Aside from that, I tried running this under valgrind, and that produces
this report:

==250595== Conditional jump or move depends on uninitialised value(s)
==250595==    at 0x731A48: sync_sessionvars_all (session_variable.c:513)
==250595==    by 0x7321A6: prepare_variable_for_reading
(session_variable.c:727)
==250595==    by 0x7320BA: CopySessionVariable (session_variable.c:898)
==250595==    by 0x7BC3BF: standard_ExecutorStart (execMain.c:252)
==250595==    by 0x7BC042: ExecutorStart (execMain.c:146)
==250595==    by 0xA89283: PortalStart (pquery.c:520)
==250595==    by 0xA84E8D: exec_simple_query (postgres.c:1199)
==250595==    by 0xA8425B: PostgresMain (postgres.c:4551)
==250595==    by 0x998B03: BackendRun (postmaster.c:4482)
==250595==    by 0x9980EC: BackendStartup (postmaster.c:4210)
==250595==    by 0x996F0D: ServerLoop (postmaster.c:1804)
==250595==    by 0x9948CA: PostmasterMain (postmaster.c:1476)
==250595==    by 0x8526B6: main (main.c:197)
==250595==  Uninitialised value was created by a heap allocation
==250595==    at 0xCD86F0: MemoryContextAllocExtended (mcxt.c:1138)
==250595==    by 0xC9FA1F: DynaHashAlloc (dynahash.c:292)
==250595==    by 0xC9FEC1: element_alloc (dynahash.c:1715)
==250595==    by 0xCA102A: get_hash_entry (dynahash.c:1324)
==250595==    by 0xCA0879: hash_search_with_hash_value (dynahash.c:1097)
==250595==    by 0xCA0432: hash_search (dynahash.c:958)
==250595==    by 0x731614: SetSessionVariable (session_variable.c:846)
==250595==    by 0x82FEED: svariableReceiveSlot (svariableReceiver.c:138)
==250595==    by 0x7BD277: ExecutePlan (execMain.c:1726)
==250595==    by 0x7BD0C5: standard_ExecutorRun (execMain.c:422)
==250595==    by 0x7BCE63: ExecutorRun (execMain.c:366)
==250595==    by 0x7332F0: ExecuteLetStmt (session_variable.c:1310)
==250595==    by 0xA8CC15: standard_ProcessUtility (utility.c:1076)
==250595==    by 0xA8BC72: ProcessUtility (utility.c:533)
==250595==    by 0xA8B2B9: PortalRunUtility (pquery.c:1161)
==250595==    by 0xA8A454: PortalRunMulti (pquery.c:1318)
==250595==    by 0xA89A16: PortalRun (pquery.c:794)
==250595==    by 0xA84F9E: exec_simple_query (postgres.c:1238)
==250595==    by 0xA8425B: PostgresMain (postgres.c:4551)
==250595==    by 0x998B03: BackendRun (postmaster.c:4482)
==250595==

Which I think means this:

    if (filter_lxid && svar->drop_lxid == MyProc->lxid)
        continue;

accesses drop_lxid, which was not initialized in init_session_variable.

fixed
 
Thank you very much for this review.

Today's patch should solve all issues reported by Tomas.

Regards

Pavel




regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Sergey Shinderuk
Date:
On 13.11.2022 20:59, Pavel Stehule wrote:
> fresh rebase

Hello,

Sorry, I haven't been following this thread, but I'd like to report a 
memory management bug. I couldn't apply the latest patches, so I tested 
with v20221104-1-* patches applied atop of commit b0284bfb1db.


postgres=# create variable s text default 'abc';

create function f() returns text as $$
begin
         return g(s);
end;
$$ language plpgsql;

create function g(t text) returns text as $$
begin
         let s = 'BOOM!';
         return t;
end;
$$ language plpgsql;

select f();
CREATE VARIABLE
CREATE FUNCTION
CREATE FUNCTION
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

LOG:  server process (PID 55307) was terminated by signal 11: 
Segmentation fault
DETAIL:  Failed process was running: select f();


I believe it's a use-after-free error, triggered by assigning a new 
value to s in g(), thus making t a dangling pointer.

After reconnecting I get a scary error:

postgres=# select f();
ERROR:  compressed pglz data is corrupt


Best regards,

-- 
Sergey Shinderuk        https://postgrespro.com/




Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


po 14. 11. 2022 v 8:00 odesílatel Sergey Shinderuk <s.shinderuk@postgrespro.ru> napsal:
On 13.11.2022 20:59, Pavel Stehule wrote:
> fresh rebase

Hello,

Sorry, I haven't been following this thread, but I'd like to report a
memory management bug. I couldn't apply the latest patches, so I tested
with v20221104-1-* patches applied atop of commit b0284bfb1db.


postgres=# create variable s text default 'abc';

create function f() returns text as $$
begin
         return g(s);
end;
$$ language plpgsql;

create function g(t text) returns text as $$
begin
         let s = 'BOOM!';
         return t;
end;
$$ language plpgsql;

select f();
CREATE VARIABLE
CREATE FUNCTION
CREATE FUNCTION
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

LOG:  server process (PID 55307) was terminated by signal 11:
Segmentation fault
DETAIL:  Failed process was running: select f();


I believe it's a use-after-free error, triggered by assigning a new
value to s in g(), thus making t a dangling pointer.

After reconnecting I get a scary error:

postgres=# select f();
ERROR:  compressed pglz data is corrupt

I am able to reproduce it, and I have a quick fix, but I need to investigate i this fix will be correct

It's a good example so I have to always return a copy of value.

Regards

Pavel






Best regards,

--
Sergey Shinderuk                https://postgrespro.com/

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

po 14. 11. 2022 v 8:00 odesílatel Sergey Shinderuk <s.shinderuk@postgrespro.ru> napsal:
On 13.11.2022 20:59, Pavel Stehule wrote:
> fresh rebase

Hello,

Sorry, I haven't been following this thread, but I'd like to report a
memory management bug. I couldn't apply the latest patches, so I tested
with v20221104-1-* patches applied atop of commit b0284bfb1db.


postgres=# create variable s text default 'abc';

create function f() returns text as $$
begin
         return g(s);
end;
$$ language plpgsql;

create function g(t text) returns text as $$
begin
         let s = 'BOOM!';
         return t;
end;
$$ language plpgsql;

select f();
CREATE VARIABLE
CREATE FUNCTION
CREATE FUNCTION
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

LOG:  server process (PID 55307) was terminated by signal 11:
Segmentation fault
DETAIL:  Failed process was running: select f();

should be fixed now

Thank you for check

Regards

Pavel
 


I believe it's a use-after-free error, triggered by assigning a new
value to s in g(), thus making t a dangling pointer.

After reconnecting I get a scary error:

postgres=# select f();
ERROR:  compressed pglz data is corrupt


Best regards,

--
Sergey Shinderuk                https://postgrespro.com/

Attachment

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Erik Rijkers
Date:
Op 14-12-2022 om 05:54 schreef Pavel Stehule:
> Hi
> 
> fresh rebase

typo alert:

v20221214-0003-LET-command.patch contains

errmsg("target session varible is of type %s"

('varible' should be 'variable')

Erik



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Dmitry Dolgov
Date:
Hi,

I'm continuing review the patch slowly, and have one more issue plus one
philosophical question.

The issue have something to do with variables invalidation. Enabling
debug_discard_caches = 1 (about which I've learned from this thread) and
running this subset of the test suite:

    CREATE SCHEMA svartest;
    SET search_path = svartest;
    CREATE VARIABLE var3 AS int;

    CREATE OR REPLACE FUNCTION inc(int)
    RETURNS int AS $$
    BEGIN
      LET svartest.var3 = COALESCE(svartest.var3 + $1, $1);
      RETURN var3;
    END;
    $$ LANGUAGE plpgsql;

    SELECT inc(1);
    SELECT inc(1);
    SELECT inc(1);

crashes in my setup like this:

    #2  0x0000000000b432d4 in ExceptionalCondition (conditionName=0xce9b99 "n >= 0 && n < list->length",
fileName=0xce9c18"list.c", lineNumber=770) at assert.c:66
 
    #3  0x00000000007d3acd in list_delete_nth_cell (list=0x18ab248, n=-3388) at list.c:770
    #4  0x00000000007d3b88 in list_delete_cell (list=0x18ab248, cell=0x18dc028) at list.c:842
    #5  0x00000000006bcb52 in sync_sessionvars_all (filter_lxid=true) at session_variable.c:524
    #6  0x00000000006bd4cb in SetSessionVariable (varid=16386, value=2, isNull=false) at session_variable.c:844
    #7  0x00000000006bd617 in SetSessionVariableWithSecurityCheck (varid=16386, value=2, isNull=false) at
session_variable.c:885
    #8  0x00007f763b890698 in exec_stmt_let (estate=0x7ffcc6fd5190, stmt=0x18aa920) at pl_exec.c:5030
    #9  0x00007f763b88a746 in exec_stmts (estate=0x7ffcc6fd5190, stmts=0x18aaaa0) at pl_exec.c:2116
    #10 0x00007f763b88a247 in exec_stmt_block (estate=0x7ffcc6fd5190, block=0x18aabf8) at pl_exec.c:1935
    #11 0x00007f763b889a49 in exec_toplevel_block (estate=0x7ffcc6fd5190, block=0x18aabf8) at pl_exec.c:1626
    #12 0x00007f763b8879df in plpgsql_exec_function (func=0x18781b0, fcinfo=0x18be110, simple_eval_estate=0x0,
simple_eval_resowner=0x0,procedure_resowner=0x0, atomic=true) at pl_exec.c:615
 
    #13 0x00007f763b8a2320 in plpgsql_call_handler (fcinfo=0x18be110) at pl_handler.c:277
    #14 0x0000000000721716 in ExecInterpExpr (state=0x18bde28, econtext=0x18bd3d0, isnull=0x7ffcc6fd56d7) at
execExprInterp.c:730
    #15 0x0000000000723642 in ExecInterpExprStillValid (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at
execExprInterp.c:1855
    #16 0x000000000077a78b in ExecEvalExprSwitchContext (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at
../../../src/include/executor/executor.h:344
    #17 0x000000000077a7f4 in ExecProject (projInfo=0x18bde20) at ../../../src/include/executor/executor.h:378
    #18 0x000000000077a9dc in ExecResult (pstate=0x18bd2c0) at nodeResult.c:136
    #19 0x0000000000738ea0 in ExecProcNodeFirst (node=0x18bd2c0) at execProcnode.c:464
    #20 0x000000000072c6e3 in ExecProcNode (node=0x18bd2c0) at ../../../src/include/executor/executor.h:262
    #21 0x000000000072f426 in ExecutePlan (estate=0x18bd098, planstate=0x18bd2c0, use_parallel_mode=false,
operation=CMD_SELECT,sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x18b3eb8,
execute_once=true)at execMain.c:1691
 
    #22 0x000000000072cf76 in standard_ExecutorRun (queryDesc=0x189c688, direction=ForwardScanDirection, count=0,
execute_once=true)at execMain.c:423
 
    #23 0x000000000072cdb3 in ExecutorRun (queryDesc=0x189c688, direction=ForwardScanDirection, count=0,
execute_once=true)at execMain.c:367
 
    #24 0x000000000099afdc in PortalRunSelect (portal=0x1866648, forward=true, count=0, dest=0x18b3eb8) at
pquery.c:927
    #25 0x000000000099ac99 in PortalRun (portal=0x1866648, count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x18b3eb8,altdest=0x18b3eb8, qc=0x7ffcc6fd5a70) at pquery.c:771
 
    #26 0x000000000099487d in exec_simple_query (query_string=0x17edcc8 "SELECT inc(1);") at postgres.c:1238

It seems that sync_sessionvars_all tries to remove a cell that either doesn't
belong to the xact_recheck_varids or weird in some other way:

    +>>> p l - xact_recheck_varids->elements
    $81 = -3388

The second thing I wanted to ask about is a more strategical question. Does
anyone have clear understanding where this patch is going? The thread is quite
large, and it's hard to catch up with all the details -- it would be great if
someone could summarize the current state of things, are there any outstanding
design issues or not addressed concerns?

From the first look it seems some major topics the discussion is evolving are about:

* Validity of the use case. Seems to be quite convincingly addressed in [1] and
[2].

* Complicated logic around invalidation, concurrent create/drop etc. (I guess
the issue above is falling into the same category).

* Concerns that session variables could repeat some problems of temporary tables.

Is there anything else?

[1]: https://www.postgresql.org/message-id/CAFj8pRBT-bRQJBqkzon7tHcoFZ1byng06peZfZa0G72z46YFxg%40mail.gmail.com
[2]:
https://www.postgresql.org/message-id/flat/CAFj8pRBHSAHdainq5tRhN2Nns62h9-SZi0pvNq9DTe0VM6M1%3Dg%40mail.gmail.com#4faccb978d60e9b0b5d920e1a8a05bbf



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:


čt 22. 12. 2022 v 17:15 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
Hi,

I'm continuing review the patch slowly, and have one more issue plus one
philosophical question.

The issue have something to do with variables invalidation. Enabling
debug_discard_caches = 1 (about which I've learned from this thread) and
running this subset of the test suite:

        CREATE SCHEMA svartest;
        SET search_path = svartest;
        CREATE VARIABLE var3 AS int;

        CREATE OR REPLACE FUNCTION inc(int)
        RETURNS int AS $$
        BEGIN
          LET svartest.var3 = COALESCE(svartest.var3 + $1, $1);
          RETURN var3;
        END;
        $$ LANGUAGE plpgsql;

        SELECT inc(1);
        SELECT inc(1);
        SELECT inc(1);

crashes in my setup like this:

        #2  0x0000000000b432d4 in ExceptionalCondition (conditionName=0xce9b99 "n >= 0 && n < list->length", fileName=0xce9c18 "list.c", lineNumber=770) at assert.c:66
        #3  0x00000000007d3acd in list_delete_nth_cell (list=0x18ab248, n=-3388) at list.c:770
        #4  0x00000000007d3b88 in list_delete_cell (list=0x18ab248, cell=0x18dc028) at list.c:842
        #5  0x00000000006bcb52 in sync_sessionvars_all (filter_lxid=true) at session_variable.c:524
        #6  0x00000000006bd4cb in SetSessionVariable (varid=16386, value=2, isNull=false) at session_variable.c:844
        #7  0x00000000006bd617 in SetSessionVariableWithSecurityCheck (varid=16386, value=2, isNull=false) at session_variable.c:885
        #8  0x00007f763b890698 in exec_stmt_let (estate=0x7ffcc6fd5190, stmt=0x18aa920) at pl_exec.c:5030
        #9  0x00007f763b88a746 in exec_stmts (estate=0x7ffcc6fd5190, stmts=0x18aaaa0) at pl_exec.c:2116
        #10 0x00007f763b88a247 in exec_stmt_block (estate=0x7ffcc6fd5190, block=0x18aabf8) at pl_exec.c:1935
        #11 0x00007f763b889a49 in exec_toplevel_block (estate=0x7ffcc6fd5190, block=0x18aabf8) at pl_exec.c:1626
        #12 0x00007f763b8879df in plpgsql_exec_function (func=0x18781b0, fcinfo=0x18be110, simple_eval_estate=0x0, simple_eval_resowner=0x0, procedure_resowner=0x0, atomic=true) at pl_exec.c:615
        #13 0x00007f763b8a2320 in plpgsql_call_handler (fcinfo=0x18be110) at pl_handler.c:277
        #14 0x0000000000721716 in ExecInterpExpr (state=0x18bde28, econtext=0x18bd3d0, isnull=0x7ffcc6fd56d7) at execExprInterp.c:730
        #15 0x0000000000723642 in ExecInterpExprStillValid (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at execExprInterp.c:1855
        #16 0x000000000077a78b in ExecEvalExprSwitchContext (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at ../../../src/include/executor/executor.h:344
        #17 0x000000000077a7f4 in ExecProject (projInfo=0x18bde20) at ../../../src/include/executor/executor.h:378
        #18 0x000000000077a9dc in ExecResult (pstate=0x18bd2c0) at nodeResult.c:136
        #19 0x0000000000738ea0 in ExecProcNodeFirst (node=0x18bd2c0) at execProcnode.c:464
        #20 0x000000000072c6e3 in ExecProcNode (node=0x18bd2c0) at ../../../src/include/executor/executor.h:262
        #21 0x000000000072f426 in ExecutePlan (estate=0x18bd098, planstate=0x18bd2c0, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x18b3eb8, execute_once=true) at execMain.c:1691
        #22 0x000000000072cf76 in standard_ExecutorRun (queryDesc=0x189c688, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:423
        #23 0x000000000072cdb3 in ExecutorRun (queryDesc=0x189c688, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:367
        #24 0x000000000099afdc in PortalRunSelect (portal=0x1866648, forward=true, count=0, dest=0x18b3eb8) at pquery.c:927
        #25 0x000000000099ac99 in PortalRun (portal=0x1866648, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x18b3eb8, altdest=0x18b3eb8, qc=0x7ffcc6fd5a70) at pquery.c:771
        #26 0x000000000099487d in exec_simple_query (query_string=0x17edcc8 "SELECT inc(1);") at postgres.c:1238

It seems that sync_sessionvars_all tries to remove a cell that either doesn't
belong to the xact_recheck_varids or weird in some other way:

        +>>> p l - xact_recheck_varids->elements
        $81 = -3388

I am able to repeat this issue. I'll look at it.

The second thing I wanted to ask about is a more strategical question. Does
anyone have clear understanding where this patch is going? The thread is quite
large, and it's hard to catch up with all the details -- it would be great if
someone could summarize the current state of things, are there any outstanding
design issues or not addressed concerns?

I hope I fixed the issues founded by Julian and Tomas. Now there is not implemented transaction support related to values, and I plan to implement this feature in the next stage.
It is waiting for review.
 

From the first look it seems some major topics the discussion is evolving are about:

* Validity of the use case. Seems to be quite convincingly addressed in [1] and
[2].

* Complicated logic around invalidation, concurrent create/drop etc. (I guess
the issue above is falling into the same category).

* Concerns that session variables could repeat some problems of temporary tables.

Why do you think so? The variable has no mvcc support - it is just stored value with local visibility without mvcc support. There can be little bit similar issues like with global temporary tables.

 

Is there anything else?

[1]: https://www.postgresql.org/message-id/CAFj8pRBT-bRQJBqkzon7tHcoFZ1byng06peZfZa0G72z46YFxg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/flat/CAFj8pRBHSAHdainq5tRhN2Nns62h9-SZi0pvNq9DTe0VM6M1%3Dg%40mail.gmail.com#4faccb978d60e9b0b5d920e1a8a05bbf

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Dmitry Dolgov
Date:
> On Thu, Dec 22, 2022 at 08:45:57PM +0100, Pavel Stehule wrote:
> > From the first look it seems some major topics the discussion is evolving
> > are about:
> >
> > * Validity of the use case. Seems to be quite convincingly addressed in
> > [1] and
> > [2].
> >
> > * Complicated logic around invalidation, concurrent create/drop etc. (I
> > guess
> > the issue above is falling into the same category).
> >
> > * Concerns that session variables could repeat some problems of temporary
> > tables.
> >
>
> Why do you think so? The variable has no mvcc support - it is just stored
> value with local visibility without mvcc support. There can be little bit
> similar issues like with global temporary tables.

Yeah, sorry for not being precise, I mean global temporary tables. This
is not my analysis, I've simply picked up it was mentioned a couple of
times here. The points above are not meant to serve as an objection
against the patch, but rather to figure out if there are any gaps left
to address and come up with some sort of plan with "committed" as a
final destination.



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:
Hi

čt 22. 12. 2022 v 22:23 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Thu, Dec 22, 2022 at 08:45:57PM +0100, Pavel Stehule wrote:
> > From the first look it seems some major topics the discussion is evolving
> > are about:
> >
> > * Validity of the use case. Seems to be quite convincingly addressed in
> > [1] and
> > [2].
> >
> > * Complicated logic around invalidation, concurrent create/drop etc. (I
> > guess
> > the issue above is falling into the same category).
> >
> > * Concerns that session variables could repeat some problems of temporary
> > tables.
> >

I am sending an updated patch, fixing the mentioned issue. Big thanks for testing, and checking.
 
>
> Why do you think so? The variable has no mvcc support - it is just stored
> value with local visibility without mvcc support. There can be little bit
> similar issues like with global temporary tables.

Yeah, sorry for not being precise, I mean global temporary tables. This
is not my analysis, I've simply picked up it was mentioned a couple of
times here. The points above are not meant to serve as an objection
against the patch, but rather to figure out if there are any gaps left
to address and come up with some sort of plan with "committed" as a
final destination.

There are some similarities, but there are a lot of differences too. Handling of metadata is partially similar, but session variable is almost the value cached in session memory. It has no statistics, it is not stored in a file. Because there is different storage, I don't think there is some intersection on implementation level.

Regards

Pavel

Attachment

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Julien Rouhaud
Date:
Hi,

On Fri, Dec 23, 2022 at 08:38:43AM +0100, Pavel Stehule wrote:
>
> I am sending an updated patch, fixing the mentioned issue. Big thanks for
> testing, and checking.

There were multiple reviews in the last weeks which reported some issues, but
unless I'm missing something I don't see any follow up from the reviewers on
the changes?

I will still wait a bit to see if they chime in while I keep looking at the
diff since the last version of the code I reviewed.

But in the meantime I already saw a couple of things that don't look right:

--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -481,6 +481,11 @@ does_not_exist_skipping(ObjectType objtype, Node *object)
             msg = gettext_noop("publication \"%s\" does not exist, skipping");
             name = strVal(object);
             break;
+        case OBJECT_VARIABLE:
+            msg = gettext_noop("session variable \"%s\" does not exist, skipping");
+            name = NameListToString(castNode(List, object));
+            break;
+        default:

         case OBJECT_COLUMN:

the "default:" seems like a thinko during a rebase?

+Datum
+GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid expected_typid)
+{
+    SVariable    svar;
+
+    svar = prepare_variable_for_reading(varid);
+    Assert(svar != NULL && svar->is_valid);
+
+    return CopySessionVariableWithTypeCheck(varid, isNull, expected_typid);
+
+    if (expected_typid != svar->typid)
+        elog(ERROR, "type of variable \"%s.%s\" is different than expected",
+             get_namespace_name(get_session_variable_namespace(varid)),
+             get_session_variable_name(varid));
+
+    *isNull = svar->isnull;
+
+    return svar->value;
+}

there's a unconditional return in the middle of the function, which also looks
like a thinko during a rebase since CopySessionVariableWithTypeCheck mostly
contains the same following code.

I'm also wondering if there should be additional tests based on the last
scenario reported by Dmitry? (I don't see any new or similar test, but I may
have missed it).

> > > Why do you think so? The variable has no mvcc support - it is just stored
> > > value with local visibility without mvcc support. There can be little bit
> > > similar issues like with global temporary tables.
> >
> > Yeah, sorry for not being precise, I mean global temporary tables. This
> > is not my analysis, I've simply picked up it was mentioned a couple of
> > times here. The points above are not meant to serve as an objection
> > against the patch, but rather to figure out if there are any gaps left
> > to address and come up with some sort of plan with "committed" as a
> > final destination.
> >
>
> There are some similarities, but there are a lot of differences too.
> Handling of metadata is partially similar, but session variable is almost
> the value cached in session memory. It has no statistics, it is not stored
> in a file. Because there is different storage, I don't think there is some
> intersection on implementation level.

+1



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:


pá 6. 1. 2023 v 5:11 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Fri, Dec 23, 2022 at 08:38:43AM +0100, Pavel Stehule wrote:
>
> I am sending an updated patch, fixing the mentioned issue. Big thanks for
> testing, and checking.

There were multiple reviews in the last weeks which reported some issues, but
unless I'm missing something I don't see any follow up from the reviewers on
the changes?

I will still wait a bit to see if they chime in while I keep looking at the
diff since the last version of the code I reviewed.

But in the meantime I already saw a couple of things that don't look right:

--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -481,6 +481,11 @@ does_not_exist_skipping(ObjectType objtype, Node *object)
                        msg = gettext_noop("publication \"%s\" does not exist, skipping");
                        name = strVal(object);
                        break;
+               case OBJECT_VARIABLE:
+                       msg = gettext_noop("session variable \"%s\" does not exist, skipping");
+                       name = NameListToString(castNode(List, object));
+                       break;
+               default:

                case OBJECT_COLUMN:

the "default:" seems like a thinko during a rebase?

removed


 

+Datum
+GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid expected_typid)
+{
+       SVariable       svar;
+
+       svar = prepare_variable_for_reading(varid);
+       Assert(svar != NULL && svar->is_valid);
+
+       return CopySessionVariableWithTypeCheck(varid, isNull, expected_typid);
+
+       if (expected_typid != svar->typid)
+               elog(ERROR, "type of variable \"%s.%s\" is different than expected",
+                        get_namespace_name(get_session_variable_namespace(varid)),
+                        get_session_variable_name(varid));
+
+       *isNull = svar->isnull;
+
+       return svar->value;
+}

there's a unconditional return in the middle of the function, which also looks
like a thinko during a rebase since CopySessionVariableWithTypeCheck mostly
contains the same following code.

This looks like my mistake - originally I fixed an issue related to the usage session variable from plpgsql, and I forced a returned copy of the session variable's value.  Now, the function GetSessionVariableWithTypeCheck is not used everywhere. It can be used only from extensions, where is ensured so a) the value is not changed, b) and in a lifetime of returned value is not called any query or any expression that can change the value of this variable. I fixed this code and I enhanced comments. I am not sure if this function should not be removed. It is not used by backend, but it can be handy for extensions - it reduces possible useless copy.


I'm also wondering if there should be additional tests based on the last
scenario reported by Dmitry? (I don't see any new or similar test, but I may
have missed it).

The scenario reported by Dmitry is in tests already. I am not sure if I have to repeat it with active debug_discard_cache. I expect this mode will be activated in some testing environments.

When I checked regress tests, then debug_discard_caches is set only to zero (in one case).

I have no idea how to simply emulate this issue without debug_discard_caches on 1. It is necessary to raise the sinval message exactly when the variable is checked against system catalogue.

updated patches attached

Regards

Pavel

 

> > > Why do you think so? The variable has no mvcc support - it is just stored
> > > value with local visibility without mvcc support. There can be little bit
> > > similar issues like with global temporary tables.
> >
> > Yeah, sorry for not being precise, I mean global temporary tables. This
> > is not my analysis, I've simply picked up it was mentioned a couple of
> > times here. The points above are not meant to serve as an objection
> > against the patch, but rather to figure out if there are any gaps left
> > to address and come up with some sort of plan with "committed" as a
> > final destination.
> >
>
> There are some similarities, but there are a lot of differences too.
> Handling of metadata is partially similar, but session variable is almost
> the value cached in session memory. It has no statistics, it is not stored
> in a file. Because there is different storage, I don't think there is some
> intersection on implementation level.

+1
Attachment

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Julien Rouhaud
Date:
Hi,

On Fri, Jan 06, 2023 at 08:02:41PM +0100, Pavel Stehule wrote:
> pá 6. 1. 2023 v 5:11 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
>
> >
> > +Datum
> > +GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid
> > expected_typid)
> > +{
> > +       SVariable       svar;
> > +
> > +       svar = prepare_variable_for_reading(varid);
> > +       Assert(svar != NULL && svar->is_valid);
> > +
> > +       return CopySessionVariableWithTypeCheck(varid, isNull,
> > expected_typid);
> > +
> > +       if (expected_typid != svar->typid)
> > +               elog(ERROR, "type of variable \"%s.%s\" is different than
> > expected",
> > +
> > get_namespace_name(get_session_variable_namespace(varid)),
> > +                        get_session_variable_name(varid));
> > +
> > +       *isNull = svar->isnull;
> > +
> > +       return svar->value;
> > +}
> >
> > there's a unconditional return in the middle of the function, which also
> > looks
> > like a thinko during a rebase since CopySessionVariableWithTypeCheck mostly
> > contains the same following code.
> >
>
> This looks like my mistake - originally I fixed an issue related to the
> usage session variable from plpgsql, and I forced a returned copy of the
> session variable's value.  Now, the function
> GetSessionVariableWithTypeCheck is not used anyywhere.

Oh I didn't check if it was used in the patchset.

> It can be used only
> from extensions, where is ensured so a) the value is not changed, b) and in
> a lifetime of returned value is not called any query or any expression that
> can change the value of this variable. I fixed this code and I enhanced
> comments. I am not sure if this function should not be removed. It is not
> used by backend, but it can be handy for extensions - it reduces possible
> useless copy.

Hmm, how safe is it for third-party code to access the stored data directly
rather than a copy?  If it makes extension fragile if they're not careful
enough with cache invalidation, or even give them a way to mess up with the
data directly, it's probably not a good idea to provide such an API.


> > I'm also wondering if there should be additional tests based on the last
> > scenario reported by Dmitry? (I don't see any new or similar test, but I
> > may
> > have missed it).
> >
>
> The scenario reported by Dmitry is in tests already.

Oh, so I missed it sorry about that.  I did some testing using
debug_discard_cache in the past and didn't run into this issue, so it's
probably due to a more recent changes or before such a test was added.

> I am not sure if I
> have to repeat it with active debug_discard_cache. I expect this mode will
> be activated in some testing environments.

Yes, some buildfarm animal are configured to run with various
debug_discard_caches setting so it's not needed to override it in some tests
(it makes testing time really slow, which will be annoying for everyone
including old/slow buildfarm animals).

> I have no idea how to simply emulate this issue without
> debug_discard_caches on 1. It is necessary to raise the sinval message
> exactly when the variable is checked against system catalogue.

Manually testing while setting locally debug_discard_cache should be enough.

> updated patches attached

Thanks!



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:



> It can be used only
> from extensions, where is ensured so a) the value is not changed, b) and in
> a lifetime of returned value is not called any query or any expression that
> can change the value of this variable. I fixed this code and I enhanced
> comments. I am not sure if this function should not be removed. It is not
> used by backend, but it can be handy for extensions - it reduces possible
> useless copy.

Hmm, how safe is it for third-party code to access the stored data directly
rather than a copy?  If it makes extension fragile if they're not careful
enough with cache invalidation, or even give them a way to mess up with the
data directly, it's probably not a good idea to provide such an API.

ok, I removed it




 


> > I'm also wondering if there should be additional tests based on the last
> > scenario reported by Dmitry? (I don't see any new or similar test, but I
> > may
> > have missed it).
> >
>
> The scenario reported by Dmitry is in tests already.

Oh, so I missed it sorry about that.  I did some testing using
debug_discard_cache in the past and didn't run into this issue, so it's
probably due to a more recent changes or before such a test was added.

> I am not sure if I
> have to repeat it with active debug_discard_cache. I expect this mode will
> be activated in some testing environments.

Yes, some buildfarm animal are configured to run with various
debug_discard_caches setting so it's not needed to override it in some tests
(it makes testing time really slow, which will be annoying for everyone
including old/slow buildfarm animals).

> I have no idea how to simply emulate this issue without
> debug_discard_caches on 1. It is necessary to raise the sinval message
> exactly when the variable is checked against system catalogue.

Manually testing while setting locally debug_discard_cache should be enough.

> updated patches attached

Thanks!

I thank you

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Julien Rouhaud
Date:
Hi,

On Sat, Jan 07, 2023 at 08:09:27AM +0100, Pavel Stehule wrote:
> >
> > Hmm, how safe is it for third-party code to access the stored data directly
> > rather than a copy?  If it makes extension fragile if they're not careful
> > enough with cache invalidation, or even give them a way to mess up with the
> > data directly, it's probably not a good idea to provide such an API.
> >
>
> ok, I removed it

Another new behavior I see is the new rowtype_only parameter for
LookupVariable.  Has this been discussed?

I can see how it can be annoying to get a "variable isn't composite" type of
error when you already know that only a composite object can be used (and other
might work), but it looks really scary to entirely ignore some objects that
should be found in your search_path just because of their datatype.

And if we ignore something like "a.b" if "a" isn't a variable of composite
type, why wouldn't we apply the same "just ignore it" rule if it's indeed a
composite type but doesn't have any "b" field?  Your application could also
start to use different object if your drop a say json variable and create a
composite variable instead.

It seems to be in contradiction with how the rest of the system works and looks
wrong to me.  Note also that LookupVariable can be quite expensive since you
may have to do a lookup for every schema found in the search_path, so the
sooner it stops the better.

> > > updated patches attached

I forgot to mention it last time but you should bump the copyright year for all
new files added when you'll send a new version of the patchset.



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:


út 10. 1. 2023 v 3:20 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Sat, Jan 07, 2023 at 08:09:27AM +0100, Pavel Stehule wrote:
> >
> > Hmm, how safe is it for third-party code to access the stored data directly
> > rather than a copy?  If it makes extension fragile if they're not careful
> > enough with cache invalidation, or even give them a way to mess up with the
> > data directly, it's probably not a good idea to provide such an API.
> >
>
> ok, I removed it

Another new behavior I see is the new rowtype_only parameter for
LookupVariable.  Has this been discussed?

I think so it was discussed about table shadowing

without this filter, I lost the message "missing FROM-clause entry for ..."

 -- should fail
 SELECT varx.xxx;
-ERROR:  missing FROM-clause entry for table "varx"
-LINE 1: SELECT varx.xxx;
-               ^
+ERROR:  type text is not composite
 -- don't allow multi column query
 CREATE TYPE vartesttp AS (a1 int, b1 int, c1 int);
 CREATE VARIABLE v1 AS vartesttp;
@@ -1421,9 +1419,7 @@
 DROP TYPE ab;
 CREATE VARIABLE myvar AS int;
 SELECT myvar.blabla;
-ERROR:  missing FROM-clause entry for table "myvar"
-LINE 1: SELECT myvar.blabla;
-               ^
+ERROR:  type integer is not composite
 DROP VARIABLE myvar;
 -- the result of view should be same in parallel mode too
 CREATE VARIABLE v1 AS int;

My original idea was to try to reduce possible conflicts (in old versions of this path, a conflict was disallowed). But it is true, so these "new" error messages are sensible too, and with eliminating rowtype_only I can reduce code.



I can see how it can be annoying to get a "variable isn't composite" type of
error when you already know that only a composite object can be used (and other
might work), but it looks really scary to entirely ignore some objects that
should be found in your search_path just because of their datatype.

And if we ignore something like "a.b" if "a" isn't a variable of composite
type, why wouldn't we apply the same "just ignore it" rule if it's indeed a
composite type but doesn't have any "b" field?  Your application could also
start to use different object if your drop a say json variable and create a
composite variable instead.

It seems to be in contradiction with how the rest of the system works and looks
wrong to me.  Note also that LookupVariable can be quite expensive since you
may have to do a lookup for every schema found in the search_path, so the
sooner it stops the better.

I removed this filter
 

> > > updated patches attached

I forgot to mention it last time but you should bump the copyright year for all
new files added when you'll send a new version of the patchset.

fixed

I modified the IdentifyVariable function a little bit. With new argument noerror I am able to ensure so no error will be raised when this function is called just for shadowing detection.

Regards

Pavel
 
Attachment

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Julien Rouhaud
Date:
On Tue, Jan 10, 2023 at 08:35:16PM +0100, Pavel Stehule wrote:
> út 10. 1. 2023 v 3:20 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> >
> > Another new behavior I see is the new rowtype_only parameter for
> > LookupVariable.  Has this been discussed?
> >
>
> I think so it was discussed about table shadowing
>
> without this filter, I lost the message "missing FROM-clause entry for ..."
>
>  -- should fail
>  SELECT varx.xxx;
> -ERROR:  missing FROM-clause entry for table "varx"
> -LINE 1: SELECT varx.xxx;
> -               ^
> +ERROR:  type text is not composite
>  -- don't allow multi column query
>  CREATE TYPE vartesttp AS (a1 int, b1 int, c1 int);
>  CREATE VARIABLE v1 AS vartesttp;
> @@ -1421,9 +1419,7 @@
>  DROP TYPE ab;
>  CREATE VARIABLE myvar AS int;
>  SELECT myvar.blabla;
> -ERROR:  missing FROM-clause entry for table "myvar"
> -LINE 1: SELECT myvar.blabla;
> -               ^
> +ERROR:  type integer is not composite
>  DROP VARIABLE myvar;
>  -- the result of view should be same in parallel mode too
>  CREATE VARIABLE v1 AS int;
>
> My original idea was to try to reduce possible conflicts (in old versions
> of this path, a conflict was disallowed). But it is true, so these "new"
> error messages are sensible too, and with eliminating rowtype_only I can
> reduce code.

Ok!  Another problem is that the error message as-is is highly unhelpful as
it's not clear at all that the problem is coming from an unsuitable variable.
Maybe change makeParamSessionVariable to use lookup_rowtype_tupdesc_noerror()
and emit a friendlier error message?  Something like

variable "X.Y" is of type Z, which is not composite

> I modified the IdentifyVariable function a little bit. With new argument
> noerror I am able to ensure so no error will be raised when this function
> is called just for shadowing detection.

I locally modified IdentifyVariable to emit WARNING reports when noerror is set
to quickly see how it was used and didn't get any regression test error.  This
definitely needs to be covered by regression tests.  Looking as
session_variables.sql, the session_variables_ambiguity_warning GUC doesn't have
a lot of tests in general.



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:


st 11. 1. 2023 v 10:08 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Tue, Jan 10, 2023 at 08:35:16PM +0100, Pavel Stehule wrote:
> út 10. 1. 2023 v 3:20 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> >
> > Another new behavior I see is the new rowtype_only parameter for
> > LookupVariable.  Has this been discussed?
> >
>
> I think so it was discussed about table shadowing
>
> without this filter, I lost the message "missing FROM-clause entry for ..."
>
>  -- should fail
>  SELECT varx.xxx;
> -ERROR:  missing FROM-clause entry for table "varx"
> -LINE 1: SELECT varx.xxx;
> -               ^
> +ERROR:  type text is not composite
>  -- don't allow multi column query
>  CREATE TYPE vartesttp AS (a1 int, b1 int, c1 int);
>  CREATE VARIABLE v1 AS vartesttp;
> @@ -1421,9 +1419,7 @@
>  DROP TYPE ab;
>  CREATE VARIABLE myvar AS int;
>  SELECT myvar.blabla;
> -ERROR:  missing FROM-clause entry for table "myvar"
> -LINE 1: SELECT myvar.blabla;
> -               ^
> +ERROR:  type integer is not composite
>  DROP VARIABLE myvar;
>  -- the result of view should be same in parallel mode too
>  CREATE VARIABLE v1 AS int;
>
> My original idea was to try to reduce possible conflicts (in old versions
> of this path, a conflict was disallowed). But it is true, so these "new"
> error messages are sensible too, and with eliminating rowtype_only I can
> reduce code.

Ok!  Another problem is that the error message as-is is highly unhelpful as
it's not clear at all that the problem is coming from an unsuitable variable.
Maybe change makeParamSessionVariable to use lookup_rowtype_tupdesc_noerror()
and emit a friendlier error message?  Something like

variable "X.Y" is of type Z, which is not composite

done
 

> I modified the IdentifyVariable function a little bit. With new argument
> noerror I am able to ensure so no error will be raised when this function
> is called just for shadowing detection.

I locally modified IdentifyVariable to emit WARNING reports when noerror is set
to quickly see how it was used and didn't get any regression test error.  This
definitely needs to be covered by regression tests.  Looking as
session_variables.sql, the session_variables_ambiguity_warning GUC doesn't have
a lot of tests in general.

I enhanced regress tests about this scenario

Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Dmitry Dolgov
Date:
I've accumulated another collection of various questions and comments. As a
side note I'm getting a good feeling about this patch, those part I've read so
far looks good to me.

* I've suddenly realized one could use pseudo types for variables, and
  it not always works. E.g.:

    =# create variable pseudo_array anyarray;
    =# select pseudo_array;
     pseudo_array
    --------------
     NULL

    =# let pseudo_array = ARRAY[1, 2, 3];
    ERROR:  42804: target session variable is of type anyarray but expression is of type integer[]
    LOCATION:  svariableStartupReceiver, svariableReceiver.c:79

    =# create variable pseudo_unknown unknown;
    =# select pseudo_unknown;
    ERROR:  XX000: failed to find conversion function from unknown to text
    LOCATION:  coerce_type, parse_coerce.c:542

  Is it supposed to be like this, or something is missing?

* I think it was already mentioned in the thread, there seems to be not a
  single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some number
  of loops over the sessionvars are implemented, are they always going to be
  small enough to not make any troubles?

* sync_sessionvars_all explains why is it necessary to copy xact_recheck_varids:

         When we check the variables, the system cache can be invalidated,
         and xact_recheck_varids can be enhanced.

  I'm not quite following what the "enhancement" part is about? Is
  xact_recheck_varids could be somehow updated concurrently with the loop?

* A small typo

    diff --git a/src/backend/commands/session_variable.c b/src/backend/commands/session_variable.c
    --- a/src/backend/commands/session_variable.c
    +++ b/src/backend/commands/session_variable.c
    @@ -485,7 +485,7 @@ sync_sessionvars_all(bool filter_lxid)

            /*
             * When we check the variables, the system cache can be invalidated,
    -        * and xac_recheck_varids can be enhanced. We want to iterate
    +        * and xact_recheck_varids can be enhanced. We want to iterate

NOTE: The commentaries above were made based on the previous patch version, but
it looks like those aspects were not changed.



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:


pá 20. 1. 2023 v 21:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
I've accumulated another collection of various questions and comments. As a
side note I'm getting a good feeling about this patch, those part I've read so
far looks good to me.

* I've suddenly realized one could use pseudo types for variables, and
  it not always works. E.g.:

    =# create variable pseudo_array anyarray;
    =# select pseudo_array;
     pseudo_array
    --------------
     NULL

    =# let pseudo_array = ARRAY[1, 2, 3];
    ERROR:  42804: target session variable is of type anyarray but expression is of type integer[]
    LOCATION:  svariableStartupReceiver, svariableReceiver.c:79

    =# create variable pseudo_unknown unknown;
    =# select pseudo_unknown;
    ERROR:  XX000: failed to find conversion function from unknown to text
    LOCATION:  coerce_type, parse_coerce.c:542

  Is it supposed to be like this, or something is missing?

it is my oversight - it should be disallowed

done

 

* I think it was already mentioned in the thread, there seems to be not a
  single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some number
  of loops over the sessionvars are implemented, are they always going to be
  small enough to not make any troubles?

The longest cycle is a cycle that rechecks actively used variables against system catalog. No cycle depends on the content of variables.
 

* sync_sessionvars_all explains why is it necessary to copy xact_recheck_varids:

                 When we check the variables, the system cache can be invalidated,
                 and xact_recheck_varids can be enhanced.

  I'm not quite following what the "enhancement" part is about? Is
  xact_recheck_varids could be somehow updated concurrently with the loop?

Yes. pg_variable_cache_callback can be called when is_session_variable_valid is executed.

Maybe "extended" can be a better word instead of "enhanced"? I reformulated this comment

 

* A small typo

        diff --git a/src/backend/commands/session_variable.c b/src/backend/commands/session_variable.c
        --- a/src/backend/commands/session_variable.c
        +++ b/src/backend/commands/session_variable.c
        @@ -485,7 +485,7 @@ sync_sessionvars_all(bool filter_lxid)

                        /*
                         * When we check the variables, the system cache can be invalidated,
        -         * and xac_recheck_varids can be enhanced. We want to iterate
        +        * and xact_recheck_varids can be enhanced. We want to iterate


fixed
 
NOTE: The commentaries above were made based on the previous patch version, but
it looks like those aspects were not changed.

Thank you for comments, updated rebased patch assigned

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Dmitry Dolgov
Date:
> On Sun, Jan 22, 2023 at 07:47:07PM +0100, Pavel Stehule wrote:
> pá 20. 1. 2023 v 21:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > * I think it was already mentioned in the thread, there seems to be not a
> >   single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some
> > number
> >   of loops over the sessionvars are implemented, are they always going to
> > be
> >   small enough to not make any troubles?
> >
>
> The longest cycle is a cycle that rechecks actively used variables against
> system catalog. No cycle depends on the content of variables.

Right, but what about the cases with huge number of variables? Not
saying it could be in any sense common, but possible to do.

> > * sync_sessionvars_all explains why is it necessary to copy
> > xact_recheck_varids:
> >
> >                  When we check the variables, the system cache can be
> > invalidated,
> >                  and xact_recheck_varids can be enhanced.
> >
> >   I'm not quite following what the "enhancement" part is about? Is
> >   xact_recheck_varids could be somehow updated concurrently with the loop?
> >
>
> Yes. pg_variable_cache_callback can be called when
> is_session_variable_valid is executed.
>
> Maybe "extended" can be a better word instead of "enhanced"? I reformulated
> this comment

Yeah, "extended" sounds better. But I was mostly confused about the
mechanism, if the cache callback can interrupt the execution at any
moment when called, that would explain it.



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:


po 23. 1. 2023 v 15:25 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sun, Jan 22, 2023 at 07:47:07PM +0100, Pavel Stehule wrote:
> pá 20. 1. 2023 v 21:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > * I think it was already mentioned in the thread, there seems to be not a
> >   single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some
> > number
> >   of loops over the sessionvars are implemented, are they always going to
> > be
> >   small enough to not make any troubles?
> >
>
> The longest cycle is a cycle that rechecks actively used variables against
> system catalog. No cycle depends on the content of variables.

Right, but what about the cases with huge number of variables? Not
saying it could be in any sense common, but possible to do.

Now I tested 10K variables (with enabled assertions,  without it is should be faster)

creating  763ms

do $$ begin for i in 1..10000 loop execute format('create variable %I as int', 'xx' || i); end loop; end $$;

assigning 491ms

do $$ begin for i in 1..10000 loop execute format('let %I = 10', 'xx' || i); end loop; end $$;

dropping without necessity of memory cleaning 1155ms

do $$ begin for i in 1..10000 loop execute format('drop variable %I', 'xx' || i); end loop; end $$;

dropping with memory cleaning 2708

just memory cleaning 72ms (time of commit - at commit cleaning)

do $$ begin for i in 1..10000 loop execute format('let %I = 10', 'xx' || i); end loop; end $$;
begin;
do $$ begin for i in 1..10000 loop execute format('drop variable %I', 'xx' || i); end loop; end $$;
commit;

So just syncing (cleaning 10K) variables needs less than 72 ms

I can be wrong, but from these numbers I don't think so these sync cycles should to contain CHECK_FOR_INTERRUPTS

What do you think?



> > * sync_sessionvars_all explains why is it necessary to copy
> > xact_recheck_varids:
> >
> >                  When we check the variables, the system cache can be
> > invalidated,
> >                  and xact_recheck_varids can be enhanced.
> >
> >   I'm not quite following what the "enhancement" part is about? Is
> >   xact_recheck_varids could be somehow updated concurrently with the loop?
> >
>
> Yes. pg_variable_cache_callback can be called when
> is_session_variable_valid is executed.
>
> Maybe "extended" can be a better word instead of "enhanced"? I reformulated
> this comment

Yeah, "extended" sounds better. But I was mostly confused about the
mechanism, if the cache callback can interrupt the execution at any
moment when called, that would explain it.

patch from yesterday has extended comment in this area :-)
 
Regards

Pavel

Re: Schema variables - new implementation for Postgres 15 (typo)

From
Dmitry Dolgov
Date:
> On Mon, Jan 23, 2023 at 07:09:27PM +0100, Pavel Stehule wrote:
> po 23. 1. 2023 v 15:25 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > > On Sun, Jan 22, 2023 at 07:47:07PM +0100, Pavel Stehule wrote:
> > > pá 20. 1. 2023 v 21:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> > > napsal:
> > >
> > > > * I think it was already mentioned in the thread, there seems to be
> > not a
> > > >   single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some
> > > > number
> > > >   of loops over the sessionvars are implemented, are they always going
> > to
> > > > be
> > > >   small enough to not make any troubles?
> > > >
> > >
> > > The longest cycle is a cycle that rechecks actively used variables
> > against
> > > system catalog. No cycle depends on the content of variables.
> >
> > Right, but what about the cases with huge number of variables? Not
> > saying it could be in any sense common, but possible to do.
> >
>
> Now I tested 10K variables (with enabled assertions,  without it is should
> be faster)
>
> [...]
>
> I can be wrong, but from these numbers I don't think so these sync cycles
> should to contain CHECK_FOR_INTERRUPTS
>
> What do you think?

Well, there is always possibility someone will create more variables
than any arbitrary limit we have tested for. But I see your point and
don't have a strong opinion about this, so let's keep it as it is :)

> > > > * sync_sessionvars_all explains why is it necessary to copy
> > > > xact_recheck_varids:
> > > >
> > > >                  When we check the variables, the system cache can be
> > > > invalidated,
> > > >                  and xact_recheck_varids can be enhanced.
> > > >
> > > >   I'm not quite following what the "enhancement" part is about? Is
> > > >   xact_recheck_varids could be somehow updated concurrently with the
> > loop?
> > > >
> > >
> > > Yes. pg_variable_cache_callback can be called when
> > > is_session_variable_valid is executed.
> > >
> > > Maybe "extended" can be a better word instead of "enhanced"? I
> > reformulated
> > > this comment
> >
> > Yeah, "extended" sounds better. But I was mostly confused about the
> > mechanism, if the cache callback can interrupt the execution at any
> > moment when called, that would explain it.
> >
>
> patch from yesterday has extended comment in this area :-)

Thanks!



Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:



>
> I can be wrong, but from these numbers I don't think so these sync cycles
> should to contain CHECK_FOR_INTERRUPTS
>
> What do you think?

Well, there is always possibility someone will create more variables
than any arbitrary limit we have tested for. But I see your point and
don't have a strong opinion about this, so let's keep it as it is :)


In this case, I afraid more about possible impacts of canceling than long operation.

It should be possible to cancel query - but you cannot to cancel followup operation like memory cleaning or other resource releasing.

The possibility to be cancelled in this cycle means rewriting  processing to be much more defensive (and slower). And although you can hypothetically cancel sync cycles, then you should to some time finish these cycles because you need to clean memory from garbage.

Regards

Pavel







ok :)

If it is an issue, then it can be easily fixed at future, but I don't think

I


Re: Schema variables - new implementation for Postgres 15 (typo)

From
Pavel Stehule
Date:
Hi

I read notes from the FOSDEM developer meeting, and I would like to repeat notice about motivation for introduction of session variables, and one reason why session_variables are not transactional, and why they should not be replaced by temp tables is performance.

There are more use cases where session variables can be used. One scenario for session variables is to use them like static variables. They can be used from some rows triggers, .. where local variable is not enough

create variable xx as int;

do $$                        
begin
  let xx = 1;
  for i in 1..10000 loop
    let xx = xx + 1;
  end loop;
  raise notice '%', xx;
end;
$$;
NOTICE:  10001
DO
Time: 4,079 ms

create temp table xx01(a int);
delete from xx01; vacuum full xx01; vacuum;

do $$                                  
begin
  insert into xx01 values(1);
  for i in 1..10000 loop
    update xx01 set a = a + 1;
  end loop;
  raise notice '%', (select a from xx01);
end;
$$;
NOTICE:  10001
DO
Time: 1678,949 ms (00:01,679)

postgres=# \dt+ xx01
                                    List of relations
┌───────────┬──────┬───────┬───────┬─────────────┬───────────────┬────────┬─────────────┐
│  Schema   │ Name │ Type  │ Owner │ Persistence │ Access method │  Size  │ Description │
╞═══════════╪══════╪═══════╪═══════╪═════════════╪═══════════════╪════════╪═════════════╡
│ pg_temp_3 │ xx01 │ table │ pavel │ temporary   │ heap          │ 384 kB │             │
└───────────┴──────┴───────┴───────┴─────────────┴───────────────┴────────┴─────────────┘
(1 row)

Originally, I tested 100K iterations, but it was too slow, and I cancelled it after 5 minutes. Vacuum can be done after the end of transaction.

And there can be another negative impact related to bloating of pg_attribute, pg_class, pg_depend tables.

Workaround based on custom GUC is not too bad, but there is not any possibility of  security protection (and there is not any possibility of static check in plpgsql_check) - and still it is 20x slower than session variables

do $$
begin
  perform set_config('cust.xx', '1', false);
  for i in 1..10000 loop
    perform set_config('cust.xx', (current_setting('cust.xx')::int + 1)::text, true);
  end loop;
  raise notice '%', current_setting('cust.xx');
end;
$$;
NOTICE:  10001
DO
Time: 80,201 ms

Session variables don't try to replace temp tables, and temp tables can be a very bad replacement of session's variables.

Regards

Pavel





Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Tue, Feb 28, 2023 at 06:12:50AM +0100, Pavel Stehule wrote:
>
> fresh rebase

I'm continuing to review, this time going through shadowing stuff in
transformColumnRef, IdentifyVariable etc. Well, that's a lot of leg work
for rather little outcome :) I guess all attempts to simplify this part
weren't successful?

Couple of questions to it. In IdentifyVariable in the branch handling
two values the commentary says:

    /*
     * a.b can mean "schema"."variable" or "variable"."field",
     * Check both variants, and returns InvalidOid with
     * not_unique flag, when both interpretations are
     * possible. Second node can be star. In this case, the
     * only allowed possibility is "variable"."*".
     */

I read this as "variable"."*" is a valid combination, but the very next
part of this condition says differently:

    /*
     * Session variables doesn't support unboxing by star
     * syntax. But this syntax have to be calculated here,
     * because can come from non session variables related
     * expressions.
     */
    Assert(IsA(field2, A_Star));

Is the first commentary not quite correct?

Another question about how shadowing warning should work between namespaces.
Let's say I've got two namespaces, public and test, both have a session
variable with the same name, but only one has a table with the same name:

    -- in public
    create table test_agg(a int);
    create type for_test_agg as (a int);
    create variable test_agg for_test_agg;

    -- in test
    create type for_test_agg as (a int);
    create variable test_agg for_test_agg;

Now if we will try to trigger the shadowing warning from public
namespace, it would work differently:

    -- in public
    =# let test.test_agg.a = 10;
    =# let test_agg.a = 20;
    =# set session_variables_ambiguity_warning to on;

    -- note the value returned from the table
    =# select jsonb_agg(test_agg.a) from test_agg;
    WARNING:  42702: session variable "test_agg.a" is shadowed
    LINE 1: select jsonb_agg(test_agg.a) from test_agg;
                             ^
    DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same
name.
    LOCATION:  transformColumnRef, parse_expr.c:940
     jsonb_agg
    -----------
     [1]

    -- no warning, note the session variable value
    =# select jsonb_agg(test.test_agg.a) from test_agg;
     jsonb_agg
    -----------
     [10]

It happens because in the second scenario the logic inside transformColumnRef
will not set up the node variable (there is no corresponding table in the
"test" schema), and the following conditions covering session variables
shadowing are depending on it. Is it supposed to be like this?



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 3. 3. 2023 v 21:19 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Tue, Feb 28, 2023 at 06:12:50AM +0100, Pavel Stehule wrote:
>
> fresh rebase

I'm continuing to review, this time going through shadowing stuff in
transformColumnRef, IdentifyVariable etc. Well, that's a lot of leg work
for rather little outcome :) I guess all attempts to simplify this part
weren't successful?

Originally I wrote it in the strategy "reduce false alarms". But when I think about it, it may not be good in this case. Usually the changes are done first on some developer environment, and good practice is to disallow same (possibly confusing) identifiers. So I am not against making this warning more aggressive with some possibility of false alarms.  With blocking reduction of alarms the differences in regress was zero. So I reduced this part.

 

Couple of questions to it. In IdentifyVariable in the branch handling
two values the commentary says:

    /*
     * a.b can mean "schema"."variable" or "variable"."field",
     * Check both variants, and returns InvalidOid with
     * not_unique flag, when both interpretations are
     * possible. Second node can be star. In this case, the
     * only allowed possibility is "variable"."*".
     */

I read this as "variable"."*" is a valid combination, but the very next
part of this condition says differently:

 

    /*
     * Session variables doesn't support unboxing by star
     * syntax. But this syntax have to be calculated here,
     * because can come from non session variables related
     * expressions.
     */
    Assert(IsA(field2, A_Star));

Is the first commentary not quite correct?

I think it is correct, but maybe I was not good at describing this issue. The sentence "Second node can be a star. In this case, the
the only allowed possibility is "variable"."*"." should be in the next comment.

In this part we process a list of identifiers, and we try to map these identifiers to some semantics. The parser should ensure that
all fields of lists are strings or the last field is a star. In this case the semantic "schema".* is nonsense, and the only possible semantic
is "variable".*. It is valid semantics, but unsupported now. Unboxing is available by syntax (var).*

I changed the comment

 

Another question about how shadowing warning should work between namespaces.
Let's say I've got two namespaces, public and test, both have a session
variable with the same name, but only one has a table with the same name:

    -- in public
    create table test_agg(a int);
    create type for_test_agg as (a int);
    create variable test_agg for_test_agg;

    -- in test
    create type for_test_agg as (a int);
    create variable test_agg for_test_agg;

Now if we will try to trigger the shadowing warning from public
namespace, it would work differently:

    -- in public
    =# let test.test_agg.a = 10;
    =# let test_agg.a = 20;
    =# set session_variables_ambiguity_warning to on;

        -- note the value returned from the table
        =# select jsonb_agg(test_agg.a) from test_agg;
        WARNING:  42702: session variable "test_agg.a" is shadowed
        LINE 1: select jsonb_agg(test_agg.a) from test_agg;
                                                         ^
        DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
        LOCATION:  transformColumnRef, parse_expr.c:940
         jsonb_agg
        -----------
         [1]

        -- no warning, note the session variable value
        =# select jsonb_agg(test.test_agg.a) from test_agg;
         jsonb_agg
        -----------
         [10]

It happens because in the second scenario the logic inside transformColumnRef
will not set up the node variable (there is no corresponding table in the
"test" schema), and the following conditions covering session variables
shadowing are depending on it. Is it supposed to be like this?

I am sorry, I don't understand what you want to describe. Session variables are shadowed by relations, ever. It is design. In the first case, the variable is shadowed and a warning is raised. In the second case, "test"."test_agg"."a" is a fully unique qualified identifier, and then the variable is used, and then it is not shadowed.

updated patches attached

Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Wed, Mar 08, 2023 at 08:31:07AM +0100, Pavel Stehule wrote:
> pá 3. 3. 2023 v 21:19 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > > On Tue, Feb 28, 2023 at 06:12:50AM +0100, Pavel Stehule wrote:
> > >
> > > fresh rebase
> >
> > I'm continuing to review, this time going through shadowing stuff in
> > transformColumnRef, IdentifyVariable etc. Well, that's a lot of leg work
> > for rather little outcome :) I guess all attempts to simplify this part
> > weren't successful?
> >
>
> Originally I wrote it in the strategy "reduce false alarms". But when I
> think about it, it may not be good in this case. Usually the changes are
> done first on some developer environment, and good practice is to disallow
> same (possibly confusing) identifiers. So I am not against making this
> warning more aggressive with some possibility of false alarms.  With
> blocking reduction of alarms the differences in regress was zero. So I
> reduced this part.

Great, thank you.

> > Couple of questions to it. In IdentifyVariable in the branch handling
> > two values the commentary says:
> >
> >     /*
> >      * a.b can mean "schema"."variable" or "variable"."field",
> >      * Check both variants, and returns InvalidOid with
> >      * not_unique flag, when both interpretations are
> >      * possible. Second node can be star. In this case, the
> >      * only allowed possibility is "variable"."*".
> >      */
> >
> > I read this as "variable"."*" is a valid combination, but the very next
> > part of this condition says differently:
> >
>
>
>
> >
> >     /*
> >      * Session variables doesn't support unboxing by star
> >      * syntax. But this syntax have to be calculated here,
> >      * because can come from non session variables related
> >      * expressions.
> >      */
> >     Assert(IsA(field2, A_Star));
> >
> > Is the first commentary not quite correct?
> >
>
> I think it is correct, but maybe I was not good at describing this issue.
> The sentence "Second node can be a star. In this case, the
> the only allowed possibility is "variable"."*"." should be in the next
> comment.
>
> In this part we process a list of identifiers, and we try to map these
> identifiers to some semantics. The parser should ensure that
> all fields of lists are strings or the last field is a star. In this case
> the semantic "schema".* is nonsense, and the only possible semantic
> is "variable".*. It is valid semantics, but unsupported now. Unboxing is
> available by syntax (var).*
>
> I changed the comment

Thanks. Just to clarify, by "unsupported" you mean unsupported in the
current patch implementation right? From what I understand value
unboxing could be done without parentheses in a non-top level select
query.

As a side note, I'm not sure if this branch is exercised in any tests.
I've replaced returning InvalidOid with actually doing LookupVariable(NULL, a, true)
in this case, and all the tests are still passing.

> > Another question about how shadowing warning should work between
> > namespaces.
> > Let's say I've got two namespaces, public and test, both have a session
> > variable with the same name, but only one has a table with the same name:
> >
> >     -- in public
> >     create table test_agg(a int);
> >     create type for_test_agg as (a int);
> >     create variable test_agg for_test_agg;
> >
> >     -- in test
> >     create type for_test_agg as (a int);
> >     create variable test_agg for_test_agg;
> >
> > Now if we will try to trigger the shadowing warning from public
> > namespace, it would work differently:
> >
> >     -- in public
> >     =# let test.test_agg.a = 10;
> >     =# let test_agg.a = 20;
> >     =# set session_variables_ambiguity_warning to on;
> >
> >         -- note the value returned from the table
> >         =# select jsonb_agg(test_agg.a) from test_agg;
> >         WARNING:  42702: session variable "test_agg.a" is shadowed
> >         LINE 1: select jsonb_agg(test_agg.a) from test_agg;
> >                                                          ^
> >         DETAIL:  Session variables can be shadowed by columns, routine's
> > variables and routine's arguments with the same name.
> >         LOCATION:  transformColumnRef, parse_expr.c:940
> >          jsonb_agg
> >         -----------
> >          [1]
> >
> >         -- no warning, note the session variable value
> >         =# select jsonb_agg(test.test_agg.a) from test_agg;
> >          jsonb_agg
> >         -----------
> >          [10]
> >
> > It happens because in the second scenario the logic inside
> > transformColumnRef
> > will not set up the node variable (there is no corresponding table in the
> > "test" schema), and the following conditions covering session variables
> > shadowing are depending on it. Is it supposed to be like this?
> >
>
> I am sorry, I don't understand what you want to describe. Session variables
> are shadowed by relations, ever. It is design. In the first case, the
> variable is shadowed and a warning is raised. In the second case,
> "test"."test_agg"."a" is a fully unique qualified identifier, and then the
> variable is used, and then it is not shadowed.

Yeah, there was a misunderstanding on my side, sorry. For whatever
reason I thought shadowing between schemas is a reasonable thing, but as
you pointed out it doesn't really make sense.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 8. 3. 2023 v 16:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Wed, Mar 08, 2023 at 08:31:07AM +0100, Pavel Stehule wrote:
> pá 3. 3. 2023 v 21:19 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > > On Tue, Feb 28, 2023 at 06:12:50AM +0100, Pavel Stehule wrote:
> > >
> > > fresh rebase
> >
> > I'm continuing to review, this time going through shadowing stuff in
> > transformColumnRef, IdentifyVariable etc. Well, that's a lot of leg work
> > for rather little outcome :) I guess all attempts to simplify this part
> > weren't successful?
> >
>
> Originally I wrote it in the strategy "reduce false alarms". But when I
> think about it, it may not be good in this case. Usually the changes are
> done first on some developer environment, and good practice is to disallow
> same (possibly confusing) identifiers. So I am not against making this
> warning more aggressive with some possibility of false alarms.  With
> blocking reduction of alarms the differences in regress was zero. So I
> reduced this part.

Great, thank you.

> > Couple of questions to it. In IdentifyVariable in the branch handling
> > two values the commentary says:
> >
> >     /*
> >      * a.b can mean "schema"."variable" or "variable"."field",
> >      * Check both variants, and returns InvalidOid with
> >      * not_unique flag, when both interpretations are
> >      * possible. Second node can be star. In this case, the
> >      * only allowed possibility is "variable"."*".
> >      */
> >
> > I read this as "variable"."*" is a valid combination, but the very next
> > part of this condition says differently:
> >
>
>
>
> >
> >     /*
> >      * Session variables doesn't support unboxing by star
> >      * syntax. But this syntax have to be calculated here,
> >      * because can come from non session variables related
> >      * expressions.
> >      */
> >     Assert(IsA(field2, A_Star));
> >
> > Is the first commentary not quite correct?
> >
>
> I think it is correct, but maybe I was not good at describing this issue.
> The sentence "Second node can be a star. In this case, the
> the only allowed possibility is "variable"."*"." should be in the next
> comment.
>
> In this part we process a list of identifiers, and we try to map these
> identifiers to some semantics. The parser should ensure that
> all fields of lists are strings or the last field is a star. In this case
> the semantic "schema".* is nonsense, and the only possible semantic
> is "variable".*. It is valid semantics, but unsupported now. Unboxing is
> available by syntax (var).*
>
> I changed the comment

Thanks. Just to clarify, by "unsupported" you mean unsupported in the
current patch implementation right? From what I understand value
unboxing could be done without parentheses in a non-top level select
query.

Yes, it can be implemented in the next steps. I don't think there can be some issues, but it means more lines and a little bit more complex interface.
In this step, I try to implement minimalistic required functionality that can be enhanced in next steps. For this area is an important fact, so session variables
will be shadowed always by relations. It means new functionality in session variables cannot break existing applications ever, and then there is space
for future enhancement.
 

As a side note, I'm not sure if this branch is exercised in any tests.
I've replaced returning InvalidOid with actually doing LookupVariable(NULL, a, true)
in this case, and all the tests are still passing.

Usually we don't test not yet implemented functionality.
 

> > Another question about how shadowing warning should work between
> > namespaces.
> > Let's say I've got two namespaces, public and test, both have a session
> > variable with the same name, but only one has a table with the same name:
> >
> >     -- in public
> >     create table test_agg(a int);
> >     create type for_test_agg as (a int);
> >     create variable test_agg for_test_agg;
> >
> >     -- in test
> >     create type for_test_agg as (a int);
> >     create variable test_agg for_test_agg;
> >
> > Now if we will try to trigger the shadowing warning from public
> > namespace, it would work differently:
> >
> >     -- in public
> >     =# let test.test_agg.a = 10;
> >     =# let test_agg.a = 20;
> >     =# set session_variables_ambiguity_warning to on;
> >
> >         -- note the value returned from the table
> >         =# select jsonb_agg(test_agg.a) from test_agg;
> >         WARNING:  42702: session variable "test_agg.a" is shadowed
> >         LINE 1: select jsonb_agg(test_agg.a) from test_agg;
> >                                                          ^
> >         DETAIL:  Session variables can be shadowed by columns, routine's
> > variables and routine's arguments with the same name.
> >         LOCATION:  transformColumnRef, parse_expr.c:940
> >          jsonb_agg
> >         -----------
> >          [1]
> >
> >         -- no warning, note the session variable value
> >         =# select jsonb_agg(test.test_agg.a) from test_agg;
> >          jsonb_agg
> >         -----------
> >          [10]
> >
> > It happens because in the second scenario the logic inside
> > transformColumnRef
> > will not set up the node variable (there is no corresponding table in the
> > "test" schema), and the following conditions covering session variables
> > shadowing are depending on it. Is it supposed to be like this?
> >
>
> I am sorry, I don't understand what you want to describe. Session variables
> are shadowed by relations, ever. It is design. In the first case, the
> variable is shadowed and a warning is raised. In the second case,
> "test"."test_agg"."a" is a fully unique qualified identifier, and then the
> variable is used, and then it is not shadowed.

Yeah, there was a misunderstanding on my side, sorry. For whatever
reason I thought shadowing between schemas is a reasonable thing, but as
you pointed out it doesn't really make sense.

yes. Thinking about this question is not trivial. There are more dimensions - like search path setting, catalog name, possible three fields identifier, possible collisions between variable and variable, and between variable and relation. But current design can work I think. Still it is strong enough, and it is simplified against start design.


 
Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Peter Eisentraut
Date:
On 17.03.23 21:50, Pavel Stehule wrote:
> rebase + fix-update pg_dump tests

I have a few comments on the code:

0001

ExecGrant_Variable() could probably use ExecGrant_common().

The additions to syscache.c should be formatted to the new style.

in pg_variable.h:

- create_lsn ought to have a "var" prefix.

- typo: "typmode for variable's type"

- What is the purpose of struct Variable?  It seems very similar to
   FormData_pg_variable.  At least a comment would be useful.

Preserve the trailing comma in ParseExprKind.


0002

expr_kind_allows_session_variables() should have some explanation
about criteria for determining which expression kinds should allow
variables.

Usually, we handle EXPR_KIND_* switches without default case, so we
get notified what needs to be changed if a new enum symbol is added.


0010

The material from the tutorial (advanced.sgml) might be better in
ddl.sgml.

In catalogs.sgml, the columns don't match the ones actually defined in
pg_variable.h in patch 0001 (e.g., create_lsn is missing and the order
doesn't match).

(The order of columns in pg_variable.h didn't immediately make sense to 
me either, so maybe there is a middle ground to be found.)

session_variables_ambiguity_warning: There needs to be more
information about this.  The current explanation is basically just,
"warn if your query is confusing".  Why do I want that?  Why would I
not want that?  What is the alternative?  What are some examples?
Shouldn't there be a standard behavior without a need to configure
anything?

In allfiles.sgml, dropVariable should be before dropView.




Re: Schema variables - new implementation for Postgres 15

From
Peter Eisentraut
Date:
On 17.03.23 21:50, Pavel Stehule wrote:
> Hi
> 
> rebase + fix-update pg_dump tests
> 
> Regards
> 
> Pavel
> 

I have spent several hours studying the code and the past discussions.

The problem I see in general is that everyone who reviews and tests the 
patches finds more problems, behavioral, weird internal errors, crashes. 
  These are then immediately fixed, and the cycle starts again.  I don't 
have the sense that this process has arrived at a steady state yet.

The other issue is that by its nature this patch adds a lot of code in a 
lot of places.  Large patches are more likely to be successful if they 
add a lot of code in one place or smaller amounts of code in a lot of 
places.  But this patch does both and it's just overwhelming.  There is 
so much new internal functionality and terminology.  Variables can be 
created, registered, initialized, stored, copied, prepared, set, freed, 
removed, released, synced, dropped, and more.  I don't know if anyone 
has actually reviewed all that in detail.

Has any effort been made to make this simpler, smaller, reduce scope, 
refactoring, find commonalities with other features, try to manage the 
complexity somehow?

I'm not making a comment on the details of the functionality itself.  I 
just think on the coding level it's not gotten to a satisfying situation 
yet.




Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi


čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut <peter.eisentraut@enterprisedb.com> napsal:
On 17.03.23 21:50, Pavel Stehule wrote:
> Hi
>
> rebase + fix-update pg_dump tests
>
> Regards
>
> Pavel
>

I have spent several hours studying the code and the past discussions.

The problem I see in general is that everyone who reviews and tests the
patches finds more problems, behavioral, weird internal errors, crashes.
  These are then immediately fixed, and the cycle starts again.  I don't
have the sense that this process has arrived at a steady state yet.

The other issue is that by its nature this patch adds a lot of code in a
lot of places.  Large patches are more likely to be successful if they
add a lot of code in one place or smaller amounts of code in a lot of
places.  But this patch does both and it's just overwhelming.  There is
so much new internal functionality and terminology.  Variables can be
created, registered, initialized, stored, copied, prepared, set, freed,
removed, released, synced, dropped, and more.  I don't know if anyone
has actually reviewed all that in detail.

Has any effort been made to make this simpler, smaller, reduce scope,
refactoring, find commonalities with other features, try to manage the
complexity somehow?

I'm not making a comment on the details of the functionality itself.  I
just think on the coding level it's not gotten to a satisfying situation
yet.


I agree that this patch is large, but almost all code is simple. Complex code is "only" in 0002-session-variables.patch (113KB/438KB).

Now, I have no idea how the functionality can be sensibly reduced or divided (no without significant performance loss). I see two difficult points in this code:

1. when to clean memory. The code implements cleaning very accurately - and this is unique in Postgres. Partially I implement some functionality of storage manager. Probably no code from Postgres can be reused, because there is not any support for global temporary objects. Cleaning based on sinval messages processing is difficult, but there is nothing else.  The code is a little bit more complex, because there are three types of session variables: a) session variables, b) temp session variables, c) session variables with transaction scope. Maybe @c can be removed, and maybe we don't need to support not null default (this can simplify initialization). What do you think about it?

2. how to pass a variable's value to the executor. The implementation is based on extending the Param node, but it cannot reuse query params buffers and implements own.
But it is hard to simplify code, because we want to support usage variables in queries, and usage in PL/pgSQL expressions too. And both are processed differently.

Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi


čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut <peter.eisentraut@enterprisedb.com> napsal:
On 17.03.23 21:50, Pavel Stehule wrote:
> Hi
>
> rebase + fix-update pg_dump tests
>
> Regards
>
> Pavel
>

I have spent several hours studying the code and the past discussions.

The problem I see in general is that everyone who reviews and tests the
patches finds more problems, behavioral, weird internal errors, crashes.
  These are then immediately fixed, and the cycle starts again.  I don't
have the sense that this process has arrived at a steady state yet.

The other issue is that by its nature this patch adds a lot of code in a
lot of places.  Large patches are more likely to be successful if they
add a lot of code in one place or smaller amounts of code in a lot of
places.  But this patch does both and it's just overwhelming.  There is
so much new internal functionality and terminology.  Variables can be
created, registered, initialized, stored, copied, prepared, set, freed,
removed, released, synced, dropped, and more.  I don't know if anyone
has actually reviewed all that in detail.

Has any effort been made to make this simpler, smaller, reduce scope,
refactoring, find commonalities with other features, try to manage the
complexity somehow?

I'm not making a comment on the details of the functionality itself.  I
just think on the coding level it's not gotten to a satisfying situation
yet.


I agree that this patch is large, but almost all code is simple. Complex code is "only" in 0002-session-variables.patch (113KB/438KB).

Now, I have no idea how the functionality can be sensibly reduced or divided (no without significant performance loss). I see two difficult points in this code:

1. when to clean memory. The code implements cleaning very accurately - and this is unique in Postgres. Partially I implement some functionality of storage manager. Probably no code from Postgres can be reused, because there is not any support for global temporary objects. Cleaning based on sinval messages processing is difficult, but there is nothing else.  The code is a little bit more complex, because there are three types of session variables: a) session variables, b) temp session variables, c) session variables with transaction scope. Maybe @c can be removed, and maybe we don't need to support not null default (this can simplify initialization). What do you think about it?

2. how to pass a variable's value to the executor. The implementation is based on extending the Param node, but it cannot reuse query params buffers and implements own.
But it is hard to simplify code, because we want to support usage variables in queries, and usage in PL/pgSQL expressions too. And both are processed differently.

Maybe I can divide the  patch 0002-session-variables to three sections - related to memory management, planning and execution?

Regards

Pavel


Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 21. 3. 2023 v 17:18 odesílatel Peter Eisentraut <peter.eisentraut@enterprisedb.com> napsal:
On 17.03.23 21:50, Pavel Stehule wrote:
> rebase + fix-update pg_dump tests

I have a few comments on the code:

0001

ExecGrant_Variable() could probably use ExecGrant_common().

done


The additions to syscache.c should be formatted to the new style.

done
 

in pg_variable.h:

 

- create_lsn ought to have a "var" prefix.

changed
 

- typo: "typmode for variable's type"

fixed
 

- What is the purpose of struct Variable?  It seems very similar to
   FormData_pg_variable.  At least a comment would be useful.

I wrote comment there:


/*
 * The Variable struct is based on FormData_pg_variable struct. Against
 * FormData_pg_variable it can hold node of deserialized expression used
 * for calculation of default value.
 */
 

Preserve the trailing comma in ParseExprKind.

done
 


0002

expr_kind_allows_session_variables() should have some explanation
about criteria for determining which expression kinds should allow
variables.

I wrote comment there:

 /*
  * Returns true, when expression of kind allows using of
  * session variables.
+ *
+ * The session's variables can be used everywhere where
+ * can be used external parameters. Session variables
+ * are not allowed in DDL. Session's variables cannot be
+ * used in constraints.
+ *
+ * The identifier can be parsed as an session variable
+ * only in expression's kinds where session's variables
+ * are allowed. This is the primary usage of this function.
+ *
+ * Second usage of this function is for decision if
+ * an error message "column does not exist" or "column
+ * or variable does not exist" should be printed. When
+ * we are in expression, where session variables cannot
+ * be used, we raise the first form or error message.
  */


Usually, we handle EXPR_KIND_* switches without default case, so we
get notified what needs to be changed if a new enum symbol is added.

done
 


0010

The material from the tutorial (advanced.sgml) might be better in
ddl.sgml.

moved
 

In catalogs.sgml, the columns don't match the ones actually defined in
pg_variable.h in patch 0001 (e.g., create_lsn is missing and the order
doesn't match).

fixed

 

(The order of columns in pg_variable.h didn't immediately make sense to
me either, so maybe there is a middle ground to be found.)

reordered. Still varcreate_lsn should be before varname column, because sanity check:

--
-- When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
-- some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
-- catalog C struct layout matches catalog tuple layout, arrange for the tuple
-- offset of each fixed-width, attalign='d' catalog column to be divisible by 8
-- unconditionally.  Keep such columns before the first NameData column of the
-- catalog, since packagers can override NAMEDATALEN to an odd number.

 

session_variables_ambiguity_warning: There needs to be more
information about this.  The current explanation is basically just,
"warn if your query is confusing".  Why do I want that?  Why would I
not want that?  What is the alternative?  What are some examples?
Shouldn't there be a standard behavior without a need to configure
anything?

I enhanced this entry:

+       <para>
+        The session variables can be shadowed by column references in a query. This
+        is an expected feature. The existing queries should not be broken by creating
+        any session variable, because session variables are shadowed always if the
+        identifier is ambiguous. The variables should be named without possibility
+        to collision with identifiers of other database objects (column names or
+        record field names). The warnings enabled by setting <varname>session_variables_ambiguity_warning</varname>
+        should help with finding identifier's collisions.
+<programlisting>
+CREATE TABLE foo(a int);
+INSERT INTO foo VALUES(10);
+CREATE VARIABLE a int;
+LET a = 100;
+SELECT a FROM foo;
+</programlisting>
+
+<screen>
+ a  
+----
+ 10
+(1 row)
+</screen>
+
+<programlisting>
+SET session_variables_ambiguity_warning TO on;
+SELECT a FROM foo;
+</programlisting>
+
+<screen>
+WARNING:  session variable "a" is shadowed
+LINE 1: SELECT a FROM foo;
+               ^
+DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
+ a  
+----
+ 10
+(1 row)
+</screen>
+       </para>
+       <para>
+        This feature can significantly increase size of logs, and then it is
+        disabled by default, but for testing or development environments it
+        should be enabled.

 

In allfiles.sgml, dropVariable should be before dropView.

fixed

Regards

Pavel


Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

I just have a few minor wording improvements for the various comments /
documentation you quoted.

On Sun, Mar 26, 2023 at 08:53:49AM +0200, Pavel Stehule wrote:
> út 21. 3. 2023 v 17:18 odesílatel Peter Eisentraut <
> peter.eisentraut@enterprisedb.com> napsal:
>
> > - What is the purpose of struct Variable?  It seems very similar to
> >    FormData_pg_variable.  At least a comment would be useful.
> >
>
> I wrote comment there:
>
>
> /*
>  * The Variable struct is based on FormData_pg_variable struct. Against
>  * FormData_pg_variable it can hold node of deserialized expression used
>  * for calculation of default value.
>  */

Did you mean "Unlike" rather than "Against"?

> > 0002
> >
> > expr_kind_allows_session_variables() should have some explanation
> > about criteria for determining which expression kinds should allow
> > variables.
> >
>
> I wrote comment there:
>
>  /*
>   * Returns true, when expression of kind allows using of
>   * session variables.
> + * The session's variables can be used everywhere where
> + * can be used external parameters. Session variables
> + * are not allowed in DDL. Session's variables cannot be
> + * used in constraints.
> + *
> + * The identifier can be parsed as an session variable
> + * only in expression's kinds where session's variables
> + * are allowed. This is the primary usage of this function.
> + *
> + * Second usage of this function is for decision if
> + * an error message "column does not exist" or "column
> + * or variable does not exist" should be printed. When
> + * we are in expression, where session variables cannot
> + * be used, we raise the first form or error message.
>   */

Maybe

/*
 * Returns true if the given expression kind is valid for session variables
 * Session variables can be used everywhere where external parameters can be
 * used.  Session variables are not allowed in DDL commands or in constraints.
 *
 * An identifier can be parsed as a session variable only for expression kinds
 * where session variables are allowed. This is the primary usage of this
 * function.
 *
 * Second usage of this function is to decide whether "column does not exist" or
 * "column or variable does not exist" error message should be printed.
 * When we are in an expression where session variables cannot be used, we raise
 * the first form or error message.
 */

> > session_variables_ambiguity_warning: There needs to be more
> > information about this.  The current explanation is basically just,
> > "warn if your query is confusing".  Why do I want that?  Why would I
> > not want that?  What is the alternative?  What are some examples?
> > Shouldn't there be a standard behavior without a need to configure
> > anything?
> >
>
> I enhanced this entry:
>
> +       <para>
> +        The session variables can be shadowed by column references in a
> query. This
> +        is an expected feature. The existing queries should not be broken
> by creating
> +        any session variable, because session variables are shadowed
> always if the
> +        identifier is ambiguous. The variables should be named without
> possibility
> +        to collision with identifiers of other database objects (column
> names or
> +        record field names). The warnings enabled by setting
> <varname>session_variables_ambiguity_warning</varname>
> +        should help with finding identifier's collisions.

Maybe

Session variables can be shadowed by column references in a query, this is an
expected behavior.  Previously working queries shouldn't error out by creating
any session variable, so session variables are always shadowed if an identifier
is ambiguous.  Variables should be referenced using an unambiguous identifier
without any possibility for a collision with identifier of other database
objects (column names or record fields names).  The warning messages emitted
when enabling <varname>session_variables_ambiguity_warning</varname> can help
finding such identifier collision.

> +       </para>
> +       <para>
> +        This feature can significantly increase size of logs, and then it
> is
> +        disabled by default, but for testing or development environments it
> +        should be enabled.

Maybe

This feature can significantly increase log size, so it's disabled by default.
For testing or development environments it's recommended to enable it if you
use session variables.



Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Fri, Mar 24, 2023 at 08:04:08AM +0100, Pavel Stehule wrote:
> čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
>
> > čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut <
> > peter.eisentraut@enterprisedb.com> napsal:
> >
> >> The other issue is that by its nature this patch adds a lot of code in a
> >> lot of places.  Large patches are more likely to be successful if they
> >> add a lot of code in one place or smaller amounts of code in a lot of
> >> places.  But this patch does both and it's just overwhelming.  There is
> >> so much new internal functionality and terminology.  Variables can be
> >> created, registered, initialized, stored, copied, prepared, set, freed,
> >> removed, released, synced, dropped, and more.  I don't know if anyone
> >> has actually reviewed all that in detail.
> >>
> >> Has any effort been made to make this simpler, smaller, reduce scope,
> >> refactoring, find commonalities with other features, try to manage the
> >> complexity somehow?
> >>
> > I agree that this patch is large, but almost all code is simple. Complex
> > code is "only" in 0002-session-variables.patch (113KB/438KB).
> >
> > Now, I have no idea how the functionality can be sensibly reduced or
> > divided (no without significant performance loss). I see two difficult
> > points in this code:
> >
> > 1. when to clean memory. The code implements cleaning very accurately -
> > and this is unique in Postgres. Partially I implement some functionality of
> > storage manager. Probably no code from Postgres can be reused, because
> > there is not any support for global temporary objects. Cleaning based on
> > sinval messages processing is difficult, but there is nothing else.  The
> > code is a little bit more complex, because there are three types of session
> > variables: a) session variables, b) temp session variables, c) session
> > variables with transaction scope. Maybe @c can be removed, and maybe we
> > don't need to support not null default (this can simplify initialization).
> > What do you think about it?
> >
> > 2. how to pass a variable's value to the executor. The implementation is
> > based on extending the Param node, but it cannot reuse query params buffers
> > and implements own.
> > But it is hard to simplify code, because we want to support usage
> > variables in queries, and usage in PL/pgSQL expressions too. And both are
> > processed differently.
> >
>
> Maybe I can divide the  patch 0002-session-variables to three sections -
> related to memory management, planning and execution?

I agree, the patch scale is a bit overwhelming. It's worth noting that
due to the nature of this change certain heavy lifting has to be done in
any case, plus I've got an impression that some part of the patch are
quite solid (although I haven't reviewed everything, did anyone achieve
that milestone?). But still, it would be of great help to simplify the
current implementation, and I'm afraid the only way of doing this is to
make trades-off about functionality vs change size & complexity.

Maybe instead splitting the patch into implementation components, it's
possible to split it feature-by-feature, where every single patch would
represent an independent (to a certain degree) functionality? I have in
mind something like: catalog changes; base implementation; ACL support;
xact actions implementation (on commit drop, etc); variables with
default value; shadowing; etc. If such approach is possible, it will
give us: flexibility to apply only a subset of the whole patch series;
some understanding how much complexity is coming from each feature. What
do you think about this idea?

I also recall somewhere earlier in the thread Pavel has mentioned that a
transactional version of session variables patch would be actually
simpler, and he has plans to implement it later on. Is there another
trade-off on the table we could think of, transactional vs
non-transactional session variables?



Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Sun, Mar 26, 2023 at 07:32:05PM +0800, Julien Rouhaud wrote:
> Hi,
>
> I just have a few minor wording improvements for the various comments /
> documentation you quoted.

Talking about documentation I've noticed that the implementation
contains few limitations, that are not mentioned in the docs. Examples
are WITH queries:

    WITH x AS (LET public.svar = 100) SELECT * FROM x;
    ERROR:  LET not supported in WITH query

and using with set-returning functions (haven't found any related tests).

Another small note is about this change in the rowsecurity:

        /*
    -     * For SELECT, UPDATE and DELETE, add security quals to enforce the USING
    -     * policies.  These security quals control access to existing table rows.
    -     * Restrictive policies are combined together using AND, and permissive
    -     * policies are combined together using OR.
    +     * For SELECT, LET, UPDATE and DELETE, add security quals to enforce the
    +     * USING policies.  These security quals control access to existing table
    +     * rows. Restrictive policies are combined together using AND, and
    +     * permissive policies are combined together using OR.
         */

From this commentary one may think that LET command supports row level
security, but I don't see it being implemented. A wrong commentary?



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 26. 3. 2023 v 13:32 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

I just have a few minor wording improvements for the various comments /
documentation you quoted.

On Sun, Mar 26, 2023 at 08:53:49AM +0200, Pavel Stehule wrote:
> út 21. 3. 2023 v 17:18 odesílatel Peter Eisentraut <
> peter.eisentraut@enterprisedb.com> napsal:
>
> > - What is the purpose of struct Variable?  It seems very similar to
> >    FormData_pg_variable.  At least a comment would be useful.
> >
>
> I wrote comment there:
>
>
> /*
>  * The Variable struct is based on FormData_pg_variable struct. Against
>  * FormData_pg_variable it can hold node of deserialized expression used
>  * for calculation of default value.
>  */

Did you mean "Unlike" rather than "Against"?

fixed
 

> > 0002
> >
> > expr_kind_allows_session_variables() should have some explanation
> > about criteria for determining which expression kinds should allow
> > variables.
> >
>
> I wrote comment there:
>
>  /*
>   * Returns true, when expression of kind allows using of
>   * session variables.
> + * The session's variables can be used everywhere where
> + * can be used external parameters. Session variables
> + * are not allowed in DDL. Session's variables cannot be
> + * used in constraints.
> + *
> + * The identifier can be parsed as an session variable
> + * only in expression's kinds where session's variables
> + * are allowed. This is the primary usage of this function.
> + *
> + * Second usage of this function is for decision if
> + * an error message "column does not exist" or "column
> + * or variable does not exist" should be printed. When
> + * we are in expression, where session variables cannot
> + * be used, we raise the first form or error message.
>   */

Maybe

/*
 * Returns true if the given expression kind is valid for session variables
 * Session variables can be used everywhere where external parameters can be
 * used.  Session variables are not allowed in DDL commands or in constraints.
 *
 * An identifier can be parsed as a session variable only for expression kinds
 * where session variables are allowed. This is the primary usage of this
 * function.
 *
 * Second usage of this function is to decide whether "column does not exist" or
 * "column or variable does not exist" error message should be printed.
 * When we are in an expression where session variables cannot be used, we raise
 * the first form or error message.
 */

changed
 

> > session_variables_ambiguity_warning: There needs to be more
> > information about this.  The current explanation is basically just,
> > "warn if your query is confusing".  Why do I want that?  Why would I
> > not want that?  What is the alternative?  What are some examples?
> > Shouldn't there be a standard behavior without a need to configure
> > anything?
> >
>
> I enhanced this entry:
>
> +       <para>
> +        The session variables can be shadowed by column references in a
> query. This
> +        is an expected feature. The existing queries should not be broken
> by creating
> +        any session variable, because session variables are shadowed
> always if the
> +        identifier is ambiguous. The variables should be named without
> possibility
> +        to collision with identifiers of other database objects (column
> names or
> +        record field names). The warnings enabled by setting
> <varname>session_variables_ambiguity_warning</varname>
> +        should help with finding identifier's collisions.

Maybe

Session variables can be shadowed by column references in a query, this is an
expected behavior.  Previously working queries shouldn't error out by creating
any session variable, so session variables are always shadowed if an identifier
is ambiguous.  Variables should be referenced using an unambiguous identifier
without any possibility for a collision with identifier of other database
objects (column names or record fields names).  The warning messages emitted
when enabling <varname>session_variables_ambiguity_warning</varname> can help
finding such identifier collision.

> +       </para>
> +       <para>
> +        This feature can significantly increase size of logs, and then it
> is
> +        disabled by default, but for testing or development environments it
> +        should be enabled.

Maybe

This feature can significantly increase log size, so it's disabled by default.
For testing or development environments it's recommended to enable it if you
use session variables.

replaced

Thank you very much for these language correctures

Regards

Pavel

p.s. I'll send updated patch after today or tomorrow - I have to fix broken dependency check after rebase

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 26. 3. 2023 v 19:53 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sun, Mar 26, 2023 at 07:32:05PM +0800, Julien Rouhaud wrote:
> Hi,
>
> I just have a few minor wording improvements for the various comments /
> documentation you quoted.

Talking about documentation I've noticed that the implementation
contains few limitations, that are not mentioned in the docs. Examples
are WITH queries:

    WITH x AS (LET public.svar = 100) SELECT * FROM x;
    ERROR:  LET not supported in WITH query

 The LET statement doesn't support the RETURNING clause, so using inside CTE does not make any sense.

Do you have some tips, where this behaviour should be mentioned?


and using with set-returning functions (haven't found any related tests).

There it is:

+CREATE VARIABLE public.svar AS int;
+-- should be ok
+LET public.svar = generate_series(1, 1);
+-- should fail
+LET public.svar = generate_series(1, 2);
+ERROR:  expression returned more than one row
+LET public.svar = generate_series(1, 0);
+ERROR:  expression returned no rows
+DROP VARIABLE public.svar;
 

Another small note is about this change in the rowsecurity:

        /*
    -    * For SELECT, UPDATE and DELETE, add security quals to enforce the USING
    -    * policies.  These security quals control access to existing table rows.
    -    * Restrictive policies are combined together using AND, and permissive
    -    * policies are combined together using OR.
    +    * For SELECT, LET, UPDATE and DELETE, add security quals to enforce the
    +    * USING policies.  These security quals control access to existing table
    +    * rows. Restrictive policies are combined together using AND, and
    +    * permissive policies are combined together using OR.
         */

From this commentary one may think that LET command supports row level
security, but I don't see it being implemented. A wrong commentary?

I don't think so.  The row level security should be supported. I tested it on example from doc:

CREATE TABLE public.accounts (
    manager text,
    company text,
    contact_email text
);

CREATE VARIABLE public.v AS text;

COPY public.accounts (manager, company, contact_email) FROM stdin;
t1role xxx t1role@xxx.org
t2role yyy t2role@yyy.org
\.

CREATE POLICY account_managers ON public.accounts USING ((manager = CURRENT_USER));
ALTER TABLE public.accounts ENABLE ROW LEVEL SECURITY;

GRANT SELECT,INSERT ON TABLE public.accounts TO t1role;
GRANT SELECT,INSERT ON TABLE public.accounts TO t2role;

GRANT ALL ON VARIABLE public.v TO t1role;
GRANT ALL ON VARIABLE public.v TO t2role;


[pavel@localhost postgresql.master]$ psql
Assertions: on
psql (16devel)
Type "help" for help.

(2023-03-28 21:32:33) postgres=# set role to t1role;
SET
(2023-03-28 21:32:40) postgres=# select * from accounts ;
┌─────────┬─────────┬────────────────┐
│ manager │ company │ contact_email  │
╞═════════╪═════════╪════════════════╡
│ t1role  │ xxx     │ t1role@xxx.org
└─────────┴─────────┴────────────────┘
(1 row)

(2023-03-28 21:32:45) postgres=# let v = (select company from accounts);
LET
(2023-03-28 21:32:58) postgres=# select v;
┌─────┐
│  v  │
╞═════╡
│ xxx │
└─────┘
(1 row)

(2023-03-28 21:33:03) postgres=# set role to default;
SET
(2023-03-28 21:33:12) postgres=# set role to t2role;
SET
(2023-03-28 21:33:19) postgres=# select * from accounts ;
┌─────────┬─────────┬────────────────┐
│ manager │ company │ contact_email  │
╞═════════╪═════════╪════════════════╡
│ t2role  │ yyy     │ t2role@yyy.org
└─────────┴─────────┴────────────────┘
(1 row)

(2023-03-28 21:33:22) postgres=# let v = (select company from accounts);
LET
(2023-03-28 21:33:26) postgres=# select v;
┌─────┐
│  v  │
╞═════╡
│ yyy │
└─────┘
(1 row)



Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 26. 3. 2023 v 13:32 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

I just have a few minor wording improvements for the various comments /
documentation you quoted.

On Sun, Mar 26, 2023 at 08:53:49AM +0200, Pavel Stehule wrote:
> út 21. 3. 2023 v 17:18 odesílatel Peter Eisentraut <
> peter.eisentraut@enterprisedb.com> napsal:
>
> > - What is the purpose of struct Variable?  It seems very similar to
> >    FormData_pg_variable.  At least a comment would be useful.
> >
>
> I wrote comment there:
>
>
> /*
>  * The Variable struct is based on FormData_pg_variable struct. Against
>  * FormData_pg_variable it can hold node of deserialized expression used
>  * for calculation of default value.
>  */

Did you mean "Unlike" rather than "Against"?

> > 0002
> >
> > expr_kind_allows_session_variables() should have some explanation
> > about criteria for determining which expression kinds should allow
> > variables.
> >
>
> I wrote comment there:
>
>  /*
>   * Returns true, when expression of kind allows using of
>   * session variables.
> + * The session's variables can be used everywhere where
> + * can be used external parameters. Session variables
> + * are not allowed in DDL. Session's variables cannot be
> + * used in constraints.
> + *
> + * The identifier can be parsed as an session variable
> + * only in expression's kinds where session's variables
> + * are allowed. This is the primary usage of this function.
> + *
> + * Second usage of this function is for decision if
> + * an error message "column does not exist" or "column
> + * or variable does not exist" should be printed. When
> + * we are in expression, where session variables cannot
> + * be used, we raise the first form or error message.
>   */

Maybe

/*
 * Returns true if the given expression kind is valid for session variables
 * Session variables can be used everywhere where external parameters can be
 * used.  Session variables are not allowed in DDL commands or in constraints.
 *
 * An identifier can be parsed as a session variable only for expression kinds
 * where session variables are allowed. This is the primary usage of this
 * function.
 *
 * Second usage of this function is to decide whether "column does not exist" or
 * "column or variable does not exist" error message should be printed.
 * When we are in an expression where session variables cannot be used, we raise
 * the first form or error message.
 */

> > session_variables_ambiguity_warning: There needs to be more
> > information about this.  The current explanation is basically just,
> > "warn if your query is confusing".  Why do I want that?  Why would I
> > not want that?  What is the alternative?  What are some examples?
> > Shouldn't there be a standard behavior without a need to configure
> > anything?
> >
>
> I enhanced this entry:
>
> +       <para>
> +        The session variables can be shadowed by column references in a
> query. This
> +        is an expected feature. The existing queries should not be broken
> by creating
> +        any session variable, because session variables are shadowed
> always if the
> +        identifier is ambiguous. The variables should be named without
> possibility
> +        to collision with identifiers of other database objects (column
> names or
> +        record field names). The warnings enabled by setting
> <varname>session_variables_ambiguity_warning</varname>
> +        should help with finding identifier's collisions.

Maybe

Session variables can be shadowed by column references in a query, this is an
expected behavior.  Previously working queries shouldn't error out by creating
any session variable, so session variables are always shadowed if an identifier
is ambiguous.  Variables should be referenced using an unambiguous identifier
without any possibility for a collision with identifier of other database
objects (column names or record fields names).  The warning messages emitted
when enabling <varname>session_variables_ambiguity_warning</varname> can help
finding such identifier collision.

> +       </para>
> +       <para>
> +        This feature can significantly increase size of logs, and then it
> is
> +        disabled by default, but for testing or development environments it
> +        should be enabled.

Maybe

This feature can significantly increase log size, so it's disabled by default.
For testing or development environments it's recommended to enable it if you
use session variables.

with language correctures

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Peter Eisentraut
Date:
On 24.03.23 08:04, Pavel Stehule wrote:
> Maybe I can divide the  patch 0002-session-variables to three sections - 
> related to memory management, planning and execution?

Personally, I find the existing split not helpful.  There is no value 
(to me) in putting code, documentation, and tests in three separate 
patches.  This is in fact counter-helpful (to me).  Things like the 
DISCARD command (0005) and the error messages changes (0009) can be 
separate patches, but most of the rest should probably be a single patch.

I know you have been asked earlier in the thread to provide smaller 
patches, so don't change it just for me, but this is my opinion.




Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 26. 3. 2023 v 19:44 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Fri, Mar 24, 2023 at 08:04:08AM +0100, Pavel Stehule wrote:
> čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
>
> > čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut <
> > peter.eisentraut@enterprisedb.com> napsal:
> >
> >> The other issue is that by its nature this patch adds a lot of code in a
> >> lot of places.  Large patches are more likely to be successful if they
> >> add a lot of code in one place or smaller amounts of code in a lot of
> >> places.  But this patch does both and it's just overwhelming.  There is
> >> so much new internal functionality and terminology.  Variables can be
> >> created, registered, initialized, stored, copied, prepared, set, freed,
> >> removed, released, synced, dropped, and more.  I don't know if anyone
> >> has actually reviewed all that in detail.
> >>
> >> Has any effort been made to make this simpler, smaller, reduce scope,
> >> refactoring, find commonalities with other features, try to manage the
> >> complexity somehow?
> >>
> > I agree that this patch is large, but almost all code is simple. Complex
> > code is "only" in 0002-session-variables.patch (113KB/438KB).
> >
> > Now, I have no idea how the functionality can be sensibly reduced or
> > divided (no without significant performance loss). I see two difficult
> > points in this code:
> >
> > 1. when to clean memory. The code implements cleaning very accurately -
> > and this is unique in Postgres. Partially I implement some functionality of
> > storage manager. Probably no code from Postgres can be reused, because
> > there is not any support for global temporary objects. Cleaning based on
> > sinval messages processing is difficult, but there is nothing else.  The
> > code is a little bit more complex, because there are three types of session
> > variables: a) session variables, b) temp session variables, c) session
> > variables with transaction scope. Maybe @c can be removed, and maybe we
> > don't need to support not null default (this can simplify initialization).
> > What do you think about it?
> >
> > 2. how to pass a variable's value to the executor. The implementation is
> > based on extending the Param node, but it cannot reuse query params buffers
> > and implements own.
> > But it is hard to simplify code, because we want to support usage
> > variables in queries, and usage in PL/pgSQL expressions too. And both are
> > processed differently.
> >
>
> Maybe I can divide the  patch 0002-session-variables to three sections -
> related to memory management, planning and execution?

I agree, the patch scale is a bit overwhelming. It's worth noting that
due to the nature of this change certain heavy lifting has to be done in
any case, plus I've got an impression that some part of the patch are
quite solid (although I haven't reviewed everything, did anyone achieve
that milestone?). But still, it would be of great help to simplify the
current implementation, and I'm afraid the only way of doing this is to
make trades-off about functionality vs change size & complexity.

There is not too much space for reduction - more - sometimes there is code reuse between features.

I can reduce temporary session variables, but the same AtSubXact routines are used by memory purging routines, and if only if  you drop all dependent features, then you can get some interesting number of reduced lines. I can imagine very reduced feature set like

1) no temporary variables, no reset at transaction end
2) without default expressions - default is null
3) direct memory cleaning on drop (without possibility of saved value after reverted drop) or cleaning at session end always

Note - @1 and @3 shares code

This reduced implementation can still be useful. Probably it doesn't reduce too much code, but it can reduce non trivial code. I believe so almost all not reduced code will be almost trivial

 

Maybe instead splitting the patch into implementation components, it's
possible to split it feature-by-feature, where every single patch would
represent an independent (to a certain degree) functionality? I have in
mind something like: catalog changes; base implementation; ACL support;
xact actions implementation (on commit drop, etc); variables with
default value; shadowing; etc. If such approach is possible, it will
give us: flexibility to apply only a subset of the whole patch series;
some understanding how much complexity is coming from each feature. What
do you think about this idea?

I think cleaning, dropping can be moved to a separate patch. ACL support uses generic support (it is only a few lines).

The patch 02 can be splitted - I am not sure how these parts can be independent. I'll try to split this patch, and we will see if it will be better.



I also recall somewhere earlier in the thread Pavel has mentioned that a
transactional version of session variables patch would be actually
simpler, and he has plans to implement it later on. Is there another
trade-off on the table we could think of, transactional vs
non-transactional session variables?

Maybe I didn't use the correct words.  Implementation of transactional behaviour can be relatively simple, but only if there is support for non- transactional behaviour already.

The transactional variables need a little bit more code, because you should implement mvcc. Current implementation is partially transactional - there are supported transactions and sub-transactions on catalog (and related memory cleaning), the variables by themselves are not transactional. Implementing mvcc is not too difficult - because there are already routines related to handling subtransactions. But it increases the complexity of these routines, so I postponed support for transactional variables to the next step.

Regards

Pavel
 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

st 29. 3. 2023 v 12:17 odesílatel Peter Eisentraut <peter.eisentraut@enterprisedb.com> napsal:
On 24.03.23 08:04, Pavel Stehule wrote:
> Maybe I can divide the  patch 0002-session-variables to three sections -
> related to memory management, planning and execution?

Personally, I find the existing split not helpful.  There is no value
(to me) in putting code, documentation, and tests in three separate
patches.  This is in fact counter-helpful (to me).  Things like the
DISCARD command (0005) and the error messages changes (0009) can be
separate patches, but most of the rest should probably be a single patch.

I know you have been asked earlier in the thread to provide smaller
patches, so don't change it just for me, but this is my opinion.

If I reorganize the patch to the following structure, can be it useful for you?

1. really basic functionality (no temporary variables, no def expressions, no memory cleaning)
   SELECT variable
   LET should be supported + doc, + related tests.

2. support for temporary variables (session, transaction scope),
    memory cleaning at the end of transaction

3. PL/pgSQL support
4. pg_dump
5. shadowing warning
6. ... others ...

Can it be better for you?

Regards

Pavel


Re: Schema variables - new implementation for Postgres 15

From
Peter Eisentraut
Date:
On 30.03.23 10:49, Pavel Stehule wrote:
> If I reorganize the patch to the following structure, can be it useful 
> for you?
> 
> 1. really basic functionality (no temporary variables, no def 
> expressions, no memory cleaning)
>     SELECT variable
>     LET should be supported + doc, + related tests.
> 
> 2. support for temporary variables (session, transaction scope),
>      memory cleaning at the end of transaction
> 
> 3. PL/pgSQL support
> 4. pg_dump
> 5. shadowing warning
> 6. ... others ...

That seems like an ok approach.  The pg_dump support should probably go 
into the first patch, so it's self-contained.



Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Tue, Mar 28, 2023 at 09:34:20PM +0200, Pavel Stehule wrote:
> Hi
>
> > Talking about documentation I've noticed that the implementation
> > contains few limitations, that are not mentioned in the docs. Examples
> > are WITH queries:
> >
> >     WITH x AS (LET public.svar = 100) SELECT * FROM x;
> >     ERROR:  LET not supported in WITH query
> >
>
>  The LET statement doesn't support the RETURNING clause, so using inside
> CTE does not make any sense.
>
> Do you have some tips, where this behaviour should be mentioned?

Yeah, you're right, it's probably not worth adding. I usually find it a
good idea to explicitly mention any limitations, but WITH docs are
actually have one line about statements without the RETURNING clause,
plus indeed for LET it makes even less sense.

> > and using with set-returning functions (haven't found any related tests).
> >
>
> There it is:
>
> +CREATE VARIABLE public.svar AS int;
> +-- should be ok
> +LET public.svar = generate_series(1, 1);
> +-- should fail
> +LET public.svar = generate_series(1, 2);
> +ERROR:  expression returned more than one row
> +LET public.svar = generate_series(1, 0);
> +ERROR:  expression returned no rows
> +DROP VARIABLE public.svar;

Oh, interesting. I was looking for another error message from
parse_func.c:

    set-returning functions are not allowed in LET assignment expression

Is this one you've posted somehow different?

> > Another small note is about this change in the rowsecurity:
> >
> >         /*
> >     -    * For SELECT, UPDATE and DELETE, add security quals to enforce
> > the USING
> >     -    * policies.  These security quals control access to existing
> > table rows.
> >     -    * Restrictive policies are combined together using AND, and
> > permissive
> >     -    * policies are combined together using OR.
> >     +    * For SELECT, LET, UPDATE and DELETE, add security quals to
> > enforce the
> >     +    * USING policies.  These security quals control access to
> > existing table
> >     +    * rows. Restrictive policies are combined together using AND, and
> >     +    * permissive policies are combined together using OR.
> >          */
> >
> > From this commentary one may think that LET command supports row level
> > security, but I don't see it being implemented. A wrong commentary?
> >
>
> I don't think so.  The row level security should be supported. I tested it
> on example from doc:
>
> [...]
>
> (2023-03-28 21:32:33) postgres=# set role to t1role;
> SET
> (2023-03-28 21:32:40) postgres=# select * from accounts ;
> ┌─────────┬─────────┬────────────────┐
> │ manager │ company │ contact_email  │
> ╞═════════╪═════════╪════════════════╡
> │ t1role  │ xxx     │ t1role@xxx.org │
> └─────────┴─────────┴────────────────┘
> (1 row)
>
> (2023-03-28 21:32:45) postgres=# let v = (select company from accounts);
> LET
> (2023-03-28 21:32:58) postgres=# select v;
> ┌─────┐
> │  v  │
> ╞═════╡
> │ xxx │
> └─────┘
> (1 row)
>
> (2023-03-28 21:33:03) postgres=# set role to default;
> SET
> (2023-03-28 21:33:12) postgres=# set role to t2role;
> SET
> (2023-03-28 21:33:19) postgres=# select * from accounts ;
> ┌─────────┬─────────┬────────────────┐
> │ manager │ company │ contact_email  │
> ╞═════════╪═════════╪════════════════╡
> │ t2role  │ yyy     │ t2role@yyy.org │
> └─────────┴─────────┴────────────────┘
> (1 row)
>
> (2023-03-28 21:33:22) postgres=# let v = (select company from accounts);
> LET
> (2023-03-28 21:33:26) postgres=# select v;
> ┌─────┐
> │  v  │
> ╞═════╡
> │ yyy │
> └─────┘
> (1 row)

Hm, but isn't the row level security enforced here on the select level,
not when assigning some value via LET? Plus, it seems the comment
originally refer to the command types (CMD_SELECT, etc), and there is no
CMD_LET and no need for it, right?

I'm just trying to understand if there was anything special done for
session variables in this regard, and if not, the commentary change
seems to be not needed (I know, I know, it's totally nitpicking).



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 31. 3. 2023 v 21:31 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Tue, Mar 28, 2023 at 09:34:20PM +0200, Pavel Stehule wrote:
> Hi
>
> > Talking about documentation I've noticed that the implementation
> > contains few limitations, that are not mentioned in the docs. Examples
> > are WITH queries:
> >
> >     WITH x AS (LET public.svar = 100) SELECT * FROM x;
> >     ERROR:  LET not supported in WITH query
> >
>
>  The LET statement doesn't support the RETURNING clause, so using inside
> CTE does not make any sense.
>
> Do you have some tips, where this behaviour should be mentioned?

Yeah, you're right, it's probably not worth adding. I usually find it a
good idea to explicitly mention any limitations, but WITH docs are
actually have one line about statements without the RETURNING clause,
plus indeed for LET it makes even less sense.

> > and using with set-returning functions (haven't found any related tests).
> >
>
> There it is:
>
> +CREATE VARIABLE public.svar AS int;
> +-- should be ok
> +LET public.svar = generate_series(1, 1);
> +-- should fail
> +LET public.svar = generate_series(1, 2);
> +ERROR:  expression returned more than one row
> +LET public.svar = generate_series(1, 0);
> +ERROR:  expression returned no rows
> +DROP VARIABLE public.svar;

Oh, interesting. I was looking for another error message from
parse_func.c:

    set-returning functions are not allowed in LET assignment expression

Is this one you've posted somehow different?

This limit is correct, but the error message is maybe messy - I changed it.

This is protection against:

(2023-04-01 06:25:50) postgres=# create variable xxx as int[];
CREATE VARIABLE
(2023-04-01 06:26:02) postgres=# let xxx[generate_series(1,3)] = 10;
ERROR:  set-returning functions are not allowed in LET assignment expression
LINE 1: let xxx[generate_series(1,3)] = 10;
                ^

change:
        case EXPR_KIND_LET_TARGET:
-           err = _("set-returning functions are not allowed in LET assignment expression");
+           err = _("set-returning functions are not allowed in LET target expression");
            break;

This case was not tested - so I did new test for this case


> > Another small note is about this change in the rowsecurity:
> >
> >         /*
> >     -    * For SELECT, UPDATE and DELETE, add security quals to enforce
> > the USING
> >     -    * policies.  These security quals control access to existing
> > table rows.
> >     -    * Restrictive policies are combined together using AND, and
> > permissive
> >     -    * policies are combined together using OR.
> >     +    * For SELECT, LET, UPDATE and DELETE, add security quals to
> > enforce the
> >     +    * USING policies.  These security quals control access to
> > existing table
> >     +    * rows. Restrictive policies are combined together using AND, and
> >     +    * permissive policies are combined together using OR.
> >          */
> >
> > From this commentary one may think that LET command supports row level
> > security, but I don't see it being implemented. A wrong commentary?
> >
>
> I don't think so.  The row level security should be supported. I tested it
> on example from doc:
>
> [...]
>
> (2023-03-28 21:32:33) postgres=# set role to t1role;
> SET
> (2023-03-28 21:32:40) postgres=# select * from accounts ;
> ┌─────────┬─────────┬────────────────┐
> │ manager │ company │ contact_email  │
> ╞═════════╪═════════╪════════════════╡
> │ t1role  │ xxx     │ t1role@xxx.org
> └─────────┴─────────┴────────────────┘
> (1 row)
>
> (2023-03-28 21:32:45) postgres=# let v = (select company from accounts);
> LET
> (2023-03-28 21:32:58) postgres=# select v;
> ┌─────┐
> │  v  │
> ╞═════╡
> │ xxx │
> └─────┘
> (1 row)
>
> (2023-03-28 21:33:03) postgres=# set role to default;
> SET
> (2023-03-28 21:33:12) postgres=# set role to t2role;
> SET
> (2023-03-28 21:33:19) postgres=# select * from accounts ;
> ┌─────────┬─────────┬────────────────┐
> │ manager │ company │ contact_email  │
> ╞═════════╪═════════╪════════════════╡
> │ t2role  │ yyy     │ t2role@yyy.org
> └─────────┴─────────┴────────────────┘
> (1 row)
>
> (2023-03-28 21:33:22) postgres=# let v = (select company from accounts);
> LET
> (2023-03-28 21:33:26) postgres=# select v;
> ┌─────┐
> │  v  │
> ╞═════╡
> │ yyy │
> └─────┘
> (1 row)

Hm, but isn't the row level security enforced here on the select level,
not when assigning some value via LET? Plus, it seems the comment
originally refer to the command types (CMD_SELECT, etc), and there is no
CMD_LET and no need for it, right?

I'm just trying to understand if there was anything special done for
session variables in this regard, and if not, the commentary change
seems to be not needed (I know, I know, it's totally nitpicking).

I am not sure at this point.  It is true, so it doesn't modify any lines there, and this is the reason why this comment is maybe messy.

I'll remove it.

p.s. I am sending an updated patch still in the old format. Refactoring to a new format for Peter can take some time, and the patch in the old format can be available for people who can do some tests or some checks.



Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Greg Stark
Date:
On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> This feature can significantly increase log size, so it's disabled by default.
> For testing or development environments it's recommended to enable it if you
> use session variables.

I think it's generally not practical to have warnings for valid DML.
Effectively warnings in DML are errors since they make the syntax just
unusable. I suppose it's feasible to have it as a debugging option
that defaults to off but I'm not sure it's really useful.

I suppose it raises the question of whether session variables should
be in pg_class and be in the same namespace as tables so that
collisions are impossible. I haven't looked at the code to see if
that's feasible or reasonable. But this feels a bit like what happened
with sequences where they used to be a wholly special thing and later
we realized everything was simpler if they were just a kind of
relation.

-- 
greg



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 5. 4. 2023 v 19:20 odesílatel Greg Stark <stark@mit.edu> napsal:
On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> This feature can significantly increase log size, so it's disabled by default.
> For testing or development environments it's recommended to enable it if you
> use session variables.

I think it's generally not practical to have warnings for valid DML.
Effectively warnings in DML are errors since they make the syntax just
unusable. I suppose it's feasible to have it as a debugging option
that defaults to off but I'm not sure it's really useful.

It is a tool that should help with collision detection.  Without it, it can be pretty hard to detect it. It is similar to plpgsql's extra warnings.


I suppose it raises the question of whether session variables should
be in pg_class and be in the same namespace as tables so that
collisions are impossible. I haven't looked at the code to see if
that's feasible or reasonable. But this feels a bit like what happened
with sequences where they used to be a wholly special thing and later
we realized everything was simpler if they were just a kind of
relation.

The first patch did it. But at the end, it doesn't reduce conflicts, because usually the conflicts are between variables and table's attributes (columns).

example

create variable a as int;
create table foo(a int);

select a from foo; -- the "a" is ambiguous, variable "a" is shadowed

This is a basic case, and the unique names don't help. The variables are more aggressive in namespace than tables, because they don't require be in FROM clause. This is the reason why we specify so variables are always shadowed. Only this behaviour is safe and robust. I cannot break any query (that doesn't use variables) by creating any variable. On second hand, an experience from Oracle's PL/SQL or from old PLpgSQL is, so unwanted shadowing can be hard to investigate (without some tools).

PL/pgSQL doesn't allow conflict between PL/pgSQL variables, and SQL (now), and I think so it is best. But the scope of PLpgSQL variables is relatively small, so very strict behaviour is acceptable.

The session variables are some between tables and attributes. The catalog pg_class can be enhanced about columns for variables, but it does a lot now, so I think it is not practical.


Regards

Pavel

 

--
greg

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Thu, Apr 6, 2023 at 1:58 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> st 5. 4. 2023 v 19:20 odesílatel Greg Stark <stark@mit.edu> napsal:
>>
>> On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud <rjuju123@gmail.com> wrote:
>> >
>> > This feature can significantly increase log size, so it's disabled by default.
>> > For testing or development environments it's recommended to enable it if you
>> > use session variables.
>>
>> I think it's generally not practical to have warnings for valid DML.
>> Effectively warnings in DML are errors since they make the syntax just
>> unusable. I suppose it's feasible to have it as a debugging option
>> that defaults to off but I'm not sure it's really useful.
>
>
> It is a tool that should help with collision detection.  Without it, it can be pretty hard to detect it. It is
similarto plpgsql's extra warnings. 

Another example is escape_string_warning, which can also emit warning
for valid DML.  I once had to fix some random framework that a
previous employer was using, in order to move to a more recent pg
version and have standard_conforming_strings on, and having
escape_string_warning was quite helpful.



Re: Schema variables - new implementation for Postgres 15

From
Kirk Wolak
Date:
On Wed, Apr 5, 2023 at 1:58 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


st 5. 4. 2023 v 19:20 odesílatel Greg Stark <stark@mit.edu> napsal:
On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> This feature can significantly increase log size, so it's disabled by default.
> For testing or development environments it's recommended to enable it if you
> use session variables.

I think it's generally not practical to have warnings for valid DML.
Effectively warnings in DML are errors since they make the syntax just
unusable. I suppose it's feasible to have it as a debugging option
that defaults to off but I'm not sure it's really useful.

It is a tool that should help with collision detection.  Without it, it can be pretty hard to detect it. It is similar to plpgsql's extra warnings.


I suppose it raises the question of whether session variables should
be in pg_class and be in the same namespace as tables so that
collisions are impossible. I haven't looked at the code to see if
that's feasible or reasonable. But this feels a bit like what happened
with sequences where they used to be a wholly special thing and later
we realized everything was simpler if they were just a kind of
relation.

The first patch did it. But at the end, it doesn't reduce conflicts, because usually the conflicts are between variables and table's attributes (columns).

example

create variable a as int;
create table foo(a int);

select a from foo; -- the "a" is ambiguous, variable "a" is shadowed

This is a basic case, and the unique names don't help. The variables are more aggressive in namespace than tables, because they don't require be in FROM clause. This is the reason why we specify so variables are always shadowed. Only this behaviour is safe and robust. I cannot break any query (that doesn't use variables) by creating any variable. On second hand, an experience from Oracle's PL/SQL or from old PLpgSQL is, so unwanted shadowing can be hard to investigate (without some tools).

PL/pgSQL doesn't allow conflict between PL/pgSQL variables, and SQL (now), and I think so it is best. But the scope of PLpgSQL variables is relatively small, so very strict behaviour is acceptable.

The session variables are some between tables and attributes. The catalog pg_class can be enhanced about columns for variables, but it does a lot now, so I think it is not practical.

I agree about shadowing schema variables.  But is there no way to fix that so that you can dereference the variable?
[Does an Alias work inside a procedure against a schema var?]
Does adding a schema prefix resolve it  properly, so your example, I could do:
SELECT schema_var.a AS var_a, a as COL_A from t;

Again, I like the default that it is hidden, but I can envision needing both?

Regards, Kirk

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

example

create variable a as int;
create table foo(a int);

select a from foo; -- the "a" is ambiguous, variable "a" is shadowed

This is a basic case, and the unique names don't help. The variables are more aggressive in namespace than tables, because they don't require be in FROM clause. This is the reason why we specify so variables are always shadowed. Only this behaviour is safe and robust. I cannot break any query (that doesn't use variables) by creating any variable. On second hand, an experience from Oracle's PL/SQL or from old PLpgSQL is, so unwanted shadowing can be hard to investigate (without some tools).

PL/pgSQL doesn't allow conflict between PL/pgSQL variables, and SQL (now), and I think so it is best. But the scope of PLpgSQL variables is relatively small, so very strict behaviour is acceptable.

The session variables are some between tables and attributes. The catalog pg_class can be enhanced about columns for variables, but it does a lot now, so I think it is not practical.

I agree about shadowing schema variables.  But is there no way to fix that so that you can dereference the variable?
[Does an Alias work inside a procedure against a schema var?]
Does adding a schema prefix resolve it  properly, so your example, I could do:
SELECT schema_var.a AS var_a, a as COL_A from t;

Yes, using schema can fix collisions in almost all cases. There are some possible cases, when the schema name is the same as some variable name, and in these cases there can still be collisions (and still there is a possibility to use catalog.schema.object and it can fix a collision). You can use a qualified identifier and again in most cases it fixes collisions. These cases are tested in regression tests.

Regards

Pavel


Again, I like the default that it is hidden, but I can envision needing both?

Regards, Kirk

Re: Schema variables - new implementation for Postgres 15

From
Kirk Wolak
Date:
On Thu, Mar 30, 2023 at 4:06 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

ne 26. 3. 2023 v 19:44 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Fri, Mar 24, 2023 at 08:04:08AM +0100, Pavel Stehule wrote:
> čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
>
> > čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut <
> > peter.eisentraut@enterprisedb.com> napsal:
> >
> >> The other issue is that by its nature this patch adds a lot of code in a
> >> lot of places.  Large patches are more likely to be successful if they
...
I agree, the patch scale is a bit overwhelming. It's worth noting that
due to the nature of this change certain heavy lifting has to be done in
any case, plus I've got an impression that some part of the patch are
quite solid (although I haven't reviewed everything, did anyone achieve
that milestone?). But still, it would be of great help to simplify the
current implementation, and I'm afraid the only way of doing this is to
make trades-off about functionality vs change size & complexity.

There is not too much space for reduction - more - sometimes there is code reuse between features.

I can reduce temporary session variables, but the same AtSubXact routines are used by memory purging routines, and if only if  you drop all dependent features, then you can get some interesting number of reduced lines. I can imagine very reduced feature set like

1) no temporary variables, no reset at transaction end
2) without default expressions - default is null
3) direct memory cleaning on drop (without possibility of saved value after reverted drop) or cleaning at session end always

Note - @1 and @3 shares code

Please don't remove #2.  With Default Values, I was eyeballing these as pseudo constants.  I find I have a DRY (Don't Repeat Yourself) issue in our current code base (PLPGSQL) because of the lack of shared constants throughout the application layer.  We literally created a CONST schema with SQL functions that return a set value.  It's kludgy, but clear enough.  (We have approximately 50 of these).

Regards, Kirk

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

čt 30. 3. 2023 v 15:40 odesílatel Peter Eisentraut <peter.eisentraut@enterprisedb.com> napsal:
On 30.03.23 10:49, Pavel Stehule wrote:
> If I reorganize the patch to the following structure, can be it useful
> for you?
>
> 1. really basic functionality (no temporary variables, no def
> expressions, no memory cleaning)
>     SELECT variable
>     LET should be supported + doc, + related tests.
>
> 2. support for temporary variables (session, transaction scope),
>      memory cleaning at the end of transaction
>
> 3. PL/pgSQL support
> 4. pg_dump
> 5. shadowing warning
> 6. ... others ...

I am sending a refactorized patch. Mainly I rewrote memory cleaning - now it should be more robust and more simple (no more mem alloc in sinval handler).  Against the previous patch, only the syntax "LET var = DEFAULT" is not supported. I don't think it should be supported now. These patches are incremental - every patch contains related doc, regress tests and can be tested incrementally.

New organization

1. basic CREATE VARIABLE, DROP VARIABLE, GRANT, REVOKE, ALTER, pg_dump
2. basic SELECT var, LET var = value
3. DISCARD VARIABLES
4. cleaning memory used by dropped variables
5. temporary variables + ON COMMIT DROP clause support
6. ON TRANSACTION END RESET clause support
7. DEFAULT expr clause support
8. support NOT NULL and IMMUTABLE clauses
9. use message "column or variable doesn't exists" instead "column doesn't exists"

Regards

Pavel

 

That seems like an ok approach.  The pg_dump support should probably go
into the first patch, so it's self-contained.
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

Hi

New organization

1. basic CREATE VARIABLE, DROP VARIABLE, GRANT, REVOKE, ALTER, pg_dump
2. basic SELECT var, LET var = value
3. DISCARD VARIABLES
4. cleaning memory used by dropped variables
5. temporary variables + ON COMMIT DROP clause support
6. ON TRANSACTION END RESET clause support
7. DEFAULT expr clause support
8. support NOT NULL and IMMUTABLE clauses
9. use message "column or variable doesn't exists" instead "column doesn't exists"

Regards

Pavel

 

fix tests and meson test configuration

Regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Thu, Aug 03, 2023 at 08:15:13AM +0200, Pavel Stehule wrote:
> Hi
>
> fresh rebase

Thanks for continuing efforts. The new patch structure looks better to
me (although the boundary between patches 0001 and 0002 is somewhat
fuzzy, e.g. the function NameListToString is used already in the first
one, but defined in the second). Couple of commentaries along the way:

* Looks like it's common to use BKI_DEFAULT when defining catalog
entities, something like BKI_DEFAULT(-1) for typmod, BKI_DEFAULT(0) for
collation, etc. Does it make sense to put few default values into
pg_variable as well?

* The first patch contains:

    diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
    @@ -2800,6 +2800,8 @@ AbortTransaction(void)
        AtAbort_Portals();
        smgrDoPendingSyncs(false, is_parallel_worker);
        AtEOXact_LargeObject(false);
    +
    +    /* 'false' means it's abort */
        AtAbort_Notify();
        AtEOXact_RelationMap(false, is_parallel_worker);
        AtAbort_Twophase();

What does the commentary refer to, is it needed?

* I see ExplainOneQuery got a new argument:

     static void ExplainOneQuery(Query *query, int cursorOptions,
    -                            IntoClause *into, ExplainState *es,
    +                            IntoClause *into, Oid targetvar, ExplainState *es,
                                const char *queryString, ParamListInfo params,
                                QueryEnvironment *queryEnv);

From what I understand it represents a potential session variable to be
explained. Isn't it too specific for this interface, could it be put
somewhere else? To be honest, I don't have any suggestions myself, but
it feels a bit out of place here.

* Session variable validity logic is not always clear, at least to me,
producing following awkward pieces of code:

    +        if (!svar->is_valid)
    +        {
    +            if (is_session_variable_valid(svar))
    +                svar->is_valid = true;

I get it as there are two ways how a variable could be invalid?

* It's not always easy to follow which failure modes are taken care of. E.g.

    +     * Don't try to use possibly invalid data from svar. And we don't want to
    +     * overwrite invalid svar immediately. The datumCopy can fail, and in this
    +     * case, the stored value will be invalid still.

I couldn't find any similar precautions, how exactly datumCopy can fail,
are you referring to palloc/memcpy failures?

Another confusing example was this one at the end of set_session_variable:

    +    /*
    +     * XXX While unlikely, an error here is possible. It wouldn't leak memory
    +     * as the allocated chunk has already been correctly assigned to the
    +     * session variable, but would contradict this function contract, which is
    +     * that this function should either succeed or leave the current value
    +     * untouched.
    +     */
    +    elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new value",
    +         get_namespace_name(get_session_variable_namespace(svar->varid)),
    +         get_session_variable_name(svar->varid),
    +         svar->varid);

It's not clear, which exactly error you're talking about, it's the last
instruction in the function.

Maybe it would be beneficial to have some overarching description, all
in one place, about how session variables implementation handles various
failures?



Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote:
>
> Another confusing example was this one at the end of set_session_variable:
>
>     +    /*
>     +     * XXX While unlikely, an error here is possible. It wouldn't leak memory
>     +     * as the allocated chunk has already been correctly assigned to the
>     +     * session variable, but would contradict this function contract, which is
>     +     * that this function should either succeed or leave the current value
>     +     * untouched.
>     +     */
>     +    elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new value",
>     +         get_namespace_name(get_session_variable_namespace(svar->varid)),
>     +         get_session_variable_name(svar->varid),
>     +         svar->varid);
>
> It's not clear, which exactly error you're talking about, it's the last
> instruction in the function.

FTR I think I'm the one that changed that.  The error I was talking about is
elog() itself (in case of OOM for instance), or even one of the get_* call, if
running with log_level <= DEBUG1.  It's clearly really unlikely but still
possible, thus this comment which also tries to explain why this elog() is not
done earlier.



Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Sat, Aug 12, 2023 at 09:28:19AM +0800, Julien Rouhaud wrote:
> On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote:
> >
> > Another confusing example was this one at the end of set_session_variable:
> >
> >     +    /*
> >     +     * XXX While unlikely, an error here is possible. It wouldn't leak memory
> >     +     * as the allocated chunk has already been correctly assigned to the
> >     +     * session variable, but would contradict this function contract, which is
> >     +     * that this function should either succeed or leave the current value
> >     +     * untouched.
> >     +     */
> >     +    elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new value",
> >     +         get_namespace_name(get_session_variable_namespace(svar->varid)),
> >     +         get_session_variable_name(svar->varid),
> >     +         svar->varid);
> >
> > It's not clear, which exactly error you're talking about, it's the last
> > instruction in the function.
>
> FTR I think I'm the one that changed that.  The error I was talking about is
> elog() itself (in case of OOM for instance), or even one of the get_* call, if
> running with log_level <= DEBUG1.  It's clearly really unlikely but still
> possible, thus this comment which also tries to explain why this elog() is not
> done earlier.

I see, thanks for clarification. Absolutely nitpicking, but the crucial
"that's why this elog is not done earlier" is only assumed in the
comment between the lines, not stated out loud :)



Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
On Sat, Aug 12, 2023 at 01:20:03PM +0200, Dmitry Dolgov wrote:
> > On Sat, Aug 12, 2023 at 09:28:19AM +0800, Julien Rouhaud wrote:
> > On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote:
> > >
> > > Another confusing example was this one at the end of set_session_variable:
> > >
> > >     +    /*
> > >     +     * XXX While unlikely, an error here is possible. It wouldn't leak memory
> > >     +     * as the allocated chunk has already been correctly assigned to the
> > >     +     * session variable, but would contradict this function contract, which is
> > >     +     * that this function should either succeed or leave the current value
> > >     +     * untouched.
> > >     +     */
> > >     +    elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new value",
> > >     +         get_namespace_name(get_session_variable_namespace(svar->varid)),
> > >     +         get_session_variable_name(svar->varid),
> > >     +         svar->varid);
> > >
> > > It's not clear, which exactly error you're talking about, it's the last
> > > instruction in the function.
> >
> > FTR I think I'm the one that changed that.  The error I was talking about is
> > elog() itself (in case of OOM for instance), or even one of the get_* call, if
> > running with log_level <= DEBUG1.  It's clearly really unlikely but still
> > possible, thus this comment which also tries to explain why this elog() is not
> > done earlier.
>
> I see, thanks for clarification. Absolutely nitpicking, but the crucial
> "that's why this elog is not done earlier" is only assumed in the
> comment between the lines, not stated out loud :)

Well, yes although to be fair the original version of this had a prior comment
that was making it much more obvious:

+   /*
+    * No error should happen after this poiht, otherwise we could leak the
+    * newly allocated value if any.
+    */

(which would maybe have been better said "Nothing that can error out should be
called after that point").  After quite a lot of patch revisions it now simply
says:

+    /* We can overwrite old variable now. No error expected */

I agree that a bit more explanation is needed, and maybe also reminding that
this is because all of that is done in a persistent memory context.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

pá 11. 8. 2023 v 17:58 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Thu, Aug 03, 2023 at 08:15:13AM +0200, Pavel Stehule wrote:
> Hi
>
> fresh rebase

Thanks for continuing efforts. The new patch structure looks better to
me (although the boundary between patches 0001 and 0002 is somewhat
fuzzy, e.g. the function NameListToString is used already in the first
one, but defined in the second). Couple of commentaries along the way:

NameListToString is already buildin function. Do you think NamesFromList?

This is my oversight - there is just `+extern List *NamesFromList(List *names); ` line, but sure - it should be in 0002 patch

fixed now

For all patches I tested the possibility to compile without following patches, but this issue was not reported by the compiler.

First patch is related to the system catalog - so you can create, drop, and backup session variables. Second patch is dedicated to possibility to store and use an value to session variable


* Looks like it's common to use BKI_DEFAULT when defining catalog
entities, something like BKI_DEFAULT(-1) for typmod, BKI_DEFAULT(0) for
collation, etc. Does it make sense to put few default values into
pg_variable as well?

done


* The first patch contains:

    diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
    @@ -2800,6 +2800,8 @@ AbortTransaction(void)
        AtAbort_Portals();
        smgrDoPendingSyncs(false, is_parallel_worker);
        AtEOXact_LargeObject(false);
    +
    +   /* 'false' means it's abort */
        AtAbort_Notify();
        AtEOXact_RelationMap(false, is_parallel_worker);
        AtAbort_Twophase();

What does the commentary refer to, is it needed?

it was wrongly placed, it should be part as patch 0005, but it has not too valuable benefit, so I removed it
 

* I see ExplainOneQuery got a new argument:

     static void ExplainOneQuery(Query *query, int cursorOptions,
    -                                                   IntoClause *into, ExplainState *es,
    +                                                   IntoClause *into, Oid targetvar, ExplainState *es,
                                const char *queryString, ParamListInfo params,
                                QueryEnvironment *queryEnv);

From what I understand it represents a potential session variable to be
explained. Isn't it too specific for this interface, could it be put
somewhere else? To be honest, I don't have any suggestions myself, but
it feels a bit out of place here.

The target session variable is pushed there to be used for creating VariableDestReceiver, that is necessary for workable LET command when EXPLAIN is used with ANALYZE clause.

I reduced the changes now, but there should be still because the target session variable should be pushed to ExplainOnePlan, but PlannedStmt has not any access to the Query structure where the resultVariable is stored. But I need to inject only ExplainOnePlan - no others. This is the same reason why ExplainOnePlan has an "into" argument. In other places I can use the resultVariable from the "query" argument.

* Session variable validity logic is not always clear, at least to me,
producing following awkward pieces of code:

    +           if (!svar->is_valid)
    +           {
    +                   if (is_session_variable_valid(svar))
    +                           svar->is_valid = true;

I get it as there are two ways how a variable could be invalid?

The flag is_valid is set by sinval message processing or by DROP VARIABLE command.

All invalid variables should be removed by remove_invalid_session_variables function, but this function ignores variables dropped in the current transaction (and this routine is called only once per transaction - it can be expensive, because it iterates over all variables currently used in session). The purpose of remove_invalid_session_variables inside get_session_variable is cleaning memory for dropped variables when the previous transaction is aborted.

But there is a possibility to revert DROP VARIABLE by using savepoint inside one transaction. And in this case we can have invalid variable (after DROP VARIABLE), that is not removed by remove_invalid_session_variables, but can be valid (and it is validated after is_session_variable_valid).

This is reggress test scenario

BEGIN;
  CREATE TEMP VARIABLE var1 AS int ON COMMIT DROP;
  LET var1 = 100;
  SAVEPOINT s1;
  DROP VARIABLE var1;
  ROLLBACK TO s1;
  SELECT var1;
 var1.
------
  100
(1 row)

COMMIT;

I did new comment there, and modified little bit the logic

attention: the logic is different before and after patch 0004 where memory cleaning is implemented



* It's not always easy to follow which failure modes are taken care of. E.g.

    +    * Don't try to use possibly invalid data from svar. And we don't want to
    +    * overwrite invalid svar immediately. The datumCopy can fail, and in this
    +    * case, the stored value will be invalid still.

This comment is related to usage of svar->typbyval and svar->typbylen for datumCopy. When we accept invalidation message
for some variable and then svar->is_valid is false, then we should not use these values, and we should reread it from catalog
(be executing setup_session_variable). It is done on auxiliary svar, because there is a possible risk of failure of datumCopy, and the
contract is unchanged passed svar, when any error happens.

I changed the comment.


I couldn't find any similar precautions, how exactly datumCopy can fail,
are you referring to palloc/memcpy failures?

I expected only palloc failure.
 

Another confusing example was this one at the end of set_session_variable:

    +   /*
    +    * XXX While unlikely, an error here is possible. It wouldn't leak memory
    +    * as the allocated chunk has already been correctly assigned to the
    +    * session variable, but would contradict this function contract, which is
    +    * that this function should either succeed or leave the current value
    +    * untouched.
    +    */
    +   elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new value",
    +            get_namespace_name(get_session_variable_namespace(svar->varid)),
    +            get_session_variable_name(svar->varid),
    +            svar->varid);

It's not clear, which exactly error you're talking about, it's the last
instruction in the function.

Maybe it would be beneficial to have some overarching description, all
in one place, about how session variables implementation handles various
failures?

Currently, there are only two places where there can be some failure - one is related to set and datumCopy, a second to evaluation of default expressions. 

Any other possible failures like domain's exception or not null exception has not any impact on stored value.

regards

Pavel

 
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

Hi

When I thought about global temporary tables, I got one maybe interesting idea. The one significant problem of global temporary tables is place for storing info about size or column statistics.

I think so these data can be stored simply in session variables. Any global temporary table can get assigned one session variable, that can hold these data.

Regards

Pavel




Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Wed, Aug 23, 2023 at 04:02:44PM +0200, Pavel Stehule wrote:
> NameListToString is already buildin function. Do you think NamesFromList?
>
> This is my oversight - there is just `+extern List *NamesFromList(List
> *names); ` line, but sure - it should be in 0002 patch
>
> fixed now

Right, thanks for fixing.

I think there is a wrinkle with pg_session_variables function. It
returns nothing if sessionvars hash table is empty, which has two
consequences:

* One might get confused about whether a variable is created,
  based on the information from the function. An expected behaviour, but
  could be considered a bad UX.

    =# CREATE VARIABLE var1 AS varchar;

    -- empty, is expected
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

    -- but one can't create a variable
    =# CREATE VARIABLE var1 AS varchar;
    ERROR:  42710: session variable "var1" already exists
    LOCATION:  create_variable, pg_variable.c:102

    -- yet, suddenly after a select...
    =# SELECT var2;
     var2
     ------
      NULL
      (1 row)

    -- ... it's not empty
    =# SELECT name, typname, can_select, can_update FROM pg_sessio
    n_variables();
     name |      typname      | can_select | can_update
     ------+-------------------+------------+------------
      var2 | character varying | t          | t
      (1 row)

* Running a parallel query will end up returning an empty result even
  after accessing the variable.

    -- debug_parallel_query = 1 all the time
    =# CREATE VARIABLE var2 AS varchar;

    -- empty, is expected
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

    -- but this time an access...
    SELECT var2;
     var2
     ------
      NULL
      (1 row)

    -- or set...
    =# LET var2 = 'test';

    -- doesn't change the result, it's still empty
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

Would it be a problem to make pg_session_variables inspect the catalog
or something similar if needed?



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

pá 17. 11. 2023 v 20:17 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Wed, Aug 23, 2023 at 04:02:44PM +0200, Pavel Stehule wrote:
> NameListToString is already buildin function. Do you think NamesFromList?
>
> This is my oversight - there is just `+extern List *NamesFromList(List
> *names); ` line, but sure - it should be in 0002 patch
>
> fixed now

Right, thanks for fixing.

I think there is a wrinkle with pg_session_variables function. It
returns nothing if sessionvars hash table is empty, which has two
consequences:

* One might get confused about whether a variable is created,
  based on the information from the function. An expected behaviour, but
  could be considered a bad UX.

    =# CREATE VARIABLE var1 AS varchar;

    -- empty, is expected
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

    -- but one can't create a variable
    =# CREATE VARIABLE var1 AS varchar;
    ERROR:  42710: session variable "var1" already exists
    LOCATION:  create_variable, pg_variable.c:102

    -- yet, suddenly after a select...
    =# SELECT var2;
     var2
     ------
      NULL
      (1 row)

    -- ... it's not empty
    =# SELECT name, typname, can_select, can_update FROM pg_sessio
    n_variables();
     name |      typname      | can_select | can_update
     ------+-------------------+------------+------------
      var2 | character varying | t          | t
      (1 row)

* Running a parallel query will end up returning an empty result even
  after accessing the variable.

    -- debug_parallel_query = 1 all the time
    =# CREATE VARIABLE var2 AS varchar;

    -- empty, is expected
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

    -- but this time an access...
    SELECT var2;
     var2
     ------
      NULL
      (1 row)

    -- or set...
    =# LET var2 = 'test';

    -- doesn't change the result, it's still empty
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

Would it be a problem to make pg_session_variables inspect the catalog
or something similar if needed?

It can be very easy to build pg_session_variables based on iteration over the system catalog. But I am not sure if we want it. pg_session_variables() is designed to show the variables from session memory, and it is used for testing. Originally it was named pg_debug_session_variables. If we iterate over catalog, it means using locks, and it can have an impact on isolation tests.

So maybe we can introduce a parameter for this function to show all session variables (based on catalog) or only used based on iteration over memory. Default can be "all". What do you think about it?

The difference between debug_parallel_query = 1 and debug_parallel_query = 0 is strange - and I'll check it. 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 18. 11. 2023 v 14:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

pá 17. 11. 2023 v 20:17 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Wed, Aug 23, 2023 at 04:02:44PM +0200, Pavel Stehule wrote:
> NameListToString is already buildin function. Do you think NamesFromList?
>
> This is my oversight - there is just `+extern List *NamesFromList(List
> *names); ` line, but sure - it should be in 0002 patch
>
> fixed now

Right, thanks for fixing.

I think there is a wrinkle with pg_session_variables function. It
returns nothing if sessionvars hash table is empty, which has two
consequences:

* One might get confused about whether a variable is created,
  based on the information from the function. An expected behaviour, but
  could be considered a bad UX.

    =# CREATE VARIABLE var1 AS varchar;

    -- empty, is expected
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

    -- but one can't create a variable
    =# CREATE VARIABLE var1 AS varchar;
    ERROR:  42710: session variable "var1" already exists
    LOCATION:  create_variable, pg_variable.c:102

    -- yet, suddenly after a select...
    =# SELECT var2;
     var2
     ------
      NULL
      (1 row)

    -- ... it's not empty
    =# SELECT name, typname, can_select, can_update FROM pg_sessio
    n_variables();
     name |      typname      | can_select | can_update
     ------+-------------------+------------+------------
      var2 | character varying | t          | t
      (1 row)

* Running a parallel query will end up returning an empty result even
  after accessing the variable.

    -- debug_parallel_query = 1 all the time
    =# CREATE VARIABLE var2 AS varchar;

    -- empty, is expected
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

    -- but this time an access...
    SELECT var2;
     var2
     ------
      NULL
      (1 row)

    -- or set...
    =# LET var2 = 'test';

    -- doesn't change the result, it's still empty
    =# SELECT name, typname, can_select, can_update FROM pg_session_variables();
     name | typname | can_select | can_update
     ------+---------+------------+------------
     (0 rows)

Would it be a problem to make pg_session_variables inspect the catalog
or something similar if needed?

It can be very easy to build pg_session_variables based on iteration over the system catalog. But I am not sure if we want it. pg_session_variables() is designed to show the variables from session memory, and it is used for testing. Originally it was named pg_debug_session_variables. If we iterate over catalog, it means using locks, and it can have an impact on isolation tests.

So maybe we can introduce a parameter for this function to show all session variables (based on catalog) or only used based on iteration over memory. Default can be "all". What do you think about it?

The difference between debug_parallel_query = 1 and debug_parallel_query = 0 is strange - and I'll check it. 

looks so  pg_session_variables() doesn't work  in debug_paralel_query mode.


Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:




The difference between debug_parallel_query = 1 and debug_parallel_query = 0 is strange - and I'll check it. 

looks so  pg_session_variables() doesn't work  in debug_paralel_query mode.

It is marked as parallel safe, what is probably nonsense.


Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Sat, Nov 18, 2023 at 02:19:09PM +0100, Pavel Stehule wrote:
> > Would it be a problem to make pg_session_variables inspect the catalog
> > or something similar if needed?
> >
>
> It can be very easy to build pg_session_variables based on iteration over
> the system catalog. But I am not sure if we want it. pg_session_variables()
> is designed to show the variables from session memory, and it is used for
> testing. Originally it was named pg_debug_session_variables. If we iterate
> over catalog, it means using locks, and it can have an impact on isolation
> tests.

I see, thanks for clarification. In the end one can check the catalog
directly of course, is there any other value in this function except for
debugging purposes?

As a side note, I'm intended to go one more time through the first few
patches introducing the basic functionality, and then mark it as ready
in CF. I can't break the patch in testing since quite long time, and for
most parts the changes make sense to me.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sat, Nov 18, 2023 at 02:19:09PM +0100, Pavel Stehule wrote:
> > Would it be a problem to make pg_session_variables inspect the catalog
> > or something similar if needed?
> >
>
> It can be very easy to build pg_session_variables based on iteration over
> the system catalog. But I am not sure if we want it. pg_session_variables()
> is designed to show the variables from session memory, and it is used for
> testing. Originally it was named pg_debug_session_variables. If we iterate
> over catalog, it means using locks, and it can have an impact on isolation
> tests.

I see, thanks for clarification. In the end one can check the catalog
directly of course, is there any other value in this function except for
debugging purposes?

I have no idea how it can be used for different purposes.  Theoretically it can be used to check if some variable was used (initialized) in a session already. But for this purpose it is not too practical, and if there will be some request for this functionality, then we can write a special function for this purpose. But I don't know any actual use cases for this.


As a side note, I'm intended to go one more time through the first few
patches introducing the basic functionality, and then mark it as ready
in CF. I can't break the patch in testing since quite long time, and for
most parts the changes make sense to me.

Thank you very much, for testing, comments, and all other work.

I marked pg_session_variables function as PARALLEL RESTRICTED, and did rebase

Regards

Pavel


Attachment

Re: Schema variables - new implementation for Postgres 15

From
Julien Rouhaud
Date:
Hi,

On Tue, Oct 17, 2023 at 08:52:13AM +0200, Pavel Stehule wrote:
>
> When I thought about global temporary tables, I got one maybe interesting
> idea. The one significant problem of global temporary tables is place for
> storing info about size or column statistics.
>
> I think so these data can be stored simply in session variables. Any global
> temporary table can get assigned one session variable, that can hold these
> data.

I don't know how realistic this would be.  For instance it will require to
properly link the global temporary table life cycle with the session variable
and I'm afraid it would require to add some hacks to make it work as needed.

But this still raises the question of whether this feature could be used
internally for the need of another feature.  If we think it's likely, should we
try to act right now and reserve the "pg_" prefix for internal use rather than
do that a few years down the line and probably break some user code as it was
done recently for the role names?



Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote:
> so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
> > As a side note, I'm intended to go one more time through the first few
> > patches introducing the basic functionality, and then mark it as ready
> > in CF. I can't break the patch in testing since quite long time, and for
> > most parts the changes make sense to me.
>
> I marked pg_session_variables function as PARALLEL RESTRICTED, and did
> rebase

So, after one week of uninterrupted evening reviews I've made it through
the first four patches :)

It's a decent job -- more than once, looking at the code, I thought I
could construct a case when it's going to blow up, but everything was
working just fine. Yet, I think the patch still has to be reshaped a bit
before moving forward. I've got a couple proposals of different nature:
high level changes (you probably won't like some of them, but I'm sure
they're going to be useful), technical code-level improvements/comments,
and few language changes. With those changes in mind I would be
satisfied with the patch, and hopefully they would also make it easier
for a potential committer to pick it up.

# High level proposals

* I would suggest reducing the scope of the patch as much as possible,
  and not just by trimming on the edges, but rather following Phileas
  Fogg's example with the steamboat Henrietta -- get rid of all
  non-essential parts. This will make this rather large patch more
  approachable for others.

  For that one can concentrate only on the first two patches plus the
  fourth one (memory cleanup after dropping variables), leaving DISCARD,
  ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the
  future.

  Another thing in this context would be to evaluate plpgsql support for
  this feature. You know the use case better than me, how important it
  is? Is it an intrinsic part of the feature, or session variables could
  be still valuable enough even without plpgsql? From what I see
  postponing plgpsql will make everything about ~800 lines lighter (most
  likely more), and also allow to ignore couple of concerns about the
  implementation (about this later).

* The new GUC session_variables_ambiguity_warning is definitely going to
  cause many objections, it's another knob to manage very subtle
  behaviour detail very few people will ever notice. I see the point
  behind warning about ambiguity, so probably it makes sense to bite the
  bullet and decide one way or another. The proposal is to warn always
  in potentially ambiguous situations, and if concerns are high about
  logging too much, maybe do the warning on lower logging levels.

# Code-level observations

* It feels a bit awkward to have varid assignment logic in a separate
  function, what about adding an argument with varid to
  CreateVariableDestReceiver? SetVariableDestReceiverVarid still could
  be used for CreateDestReceiver.

    /*
     * Initially create a DestReceiver object.
     */
    DestReceiver *
    CreateVariableDestReceiver(void)

    /*
     * Set parameters for a VariableDestReceiver.
     * Should be called right after creating the DestReceiver.
     */
    void
    SetVariableDestReceiverVarid(DestReceiver *self, Oid varid)

* It's worth it to add a commentary here explaining why it's fine to use
  InvalidOid here:

     if (pstmt->commandType != CMD_UTILITY)
-           ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+           ExplainOnePlan(pstmt, into, InvalidOid, es, query_string, paramLI, queryEnv,
                           &planduration, (es->buffers ? &bufusage : NULL));

  My understanding is that since LetStmt is CMD_UTILITY, this branch
  will never be visited for a session variable.

* IIUC this one is introduced to exclude session variables from the normal
  path with EXPR_KIND_UPDATE_TARGET:

+   EXPR_KIND_ASSIGN_VARIABLE,      /* PL/pgSQL assignment target - disallow
+                                                            * session variables */

  But the name doesn't sound right, maybe longer
  EXPR_KIND_UPDATE_TARGET_NO_VARS is better?

* I'm curious about this one, which exactly part does this change cover?

@@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node,
-               if (param->paramkind != PARAM_EXTERN)
+               if (param->paramkind != PARAM_EXTERN &&
+                       param->paramkind != PARAM_VARIABLE)
                        elog(ERROR, "unexpected paramkind: %d", (int) param->paramkind);

  I've commented it out, but no tests were affected.

* Does it mean there could be theoretically two LET statements at the
  same time with different command type, one CMD_UTILITY, one
  CMD_SELECT? Can it cause any issues?

+       /*
+        * Inside PL/pgSQL we don't want to execute LET statement as utility
+        * command, because it disallow to execute expression as simple
+        * expression. So for PL/pgSQL we have extra path, and we return SELECT.
+        * Then it can be executed by exec_eval_expr. Result is dirrectly assigned
+        * to target session variable inside PL/pgSQL LET statement handler. This
+        * is extra code, extra path, but possibility to get faster execution is
+        * too attractive.
+        */
+       if (stmt->plpgsql_mode)
+               return query;
+

* This probably requires more explanation, is warning the only reason
  for this change?

+        *
+        * The session variables should not be used as target of PL/pgSQL assign
+        * statement. So we should to use special parser expr kind, that disallow
+        * usage of session variables. This block unwanted (in this context)
+        * possible warning so target PL/pgSQL's variable shadows some session
+        * variable.
         */
        target = transformExpr(pstate, (Node *) cref,
-                                                  EXPR_KIND_UPDATE_TARGET);
+                                                  EXPR_KIND_ASSIGN_VARIABLE);

* It would be great to have more commentaries here:

    typedef struct
    {
        DestReceiver pub;
        Oid            varid;
        Oid            typid;
        int32        typmod;
        int            typlen;
        int            slot_offset;
        int            rows;
    } SVariableState;

  For example, why does it make sense to have a field rows, where we
  interested to only know the fact that there is exactly one column?

* Why there is SetSessionVariableWithSecurityCheck, but no
  GetSessionVariableWithSecurityCheck? Instead, object_aclcheck is done
  in standard_ExecutorStart, which looks a bit out of place.

* pg_session_variables -- you mention it exists only for testing. What
  about moving it out into a separate patch for the sake of slimming
  down? It looks like it's used only in tests for "memory cleanup"
  patch, maybe they could be restructured to not require this function.

* Probably it's time to drop unnecessary historical notes, like this:

 * Note: originally we enhanced a list xact_recheck_varids here. Unfortunately
 * it was not safe and a little bit too complex, because the sinval callback
 * function can be called when we iterate over xact_recheck_varids list.
 * Another issue was the possibility of being out of memory when we enhanced
 * the list. So now we just switch flag in related entry sessionvars hash table.
 * We need to iterate over hash table on every sinval message, so extra two
 * iteration over this hash table is not significant overhead (and we skip
 * entries that don't require recheck). Now we do not have any memory allocation
 * in the sinval handler (This note can be removed before commit).

* The second patch "Storage for session variables and SQL interface",
  mentions DISCARD command:

    /*
     * There is no guarantee of sessionvars being initialized, even when
     * receiving an invalidation callback, as DISCARD [ ALL | VARIABLES ]
     * destroys the hash table entirely.
     */

  This command is implemented in another patch later one, so this
  comment probably belong there.

* This comment mentions a "direct access, without buffering":

    /*
     * Direct access to session variable (without buffering). Because
     * returned value can be used (without an assignement) after the
     * referenced session variables is updated, we have to use an copy
     * of stored value every time.
     */
    *op->resvalue = GetSessionVariableWithTypeCheck(op->d.vparam.varid,
                                                    op->resnull,
                                                    op->d.vparam.vartype);

  But GetSessionVariableWithTypeCheck goes through get_session_variable
  and searches in the hash table. What "buffering" means in this
  context?

* GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid expected_typid)

  Should the "WithTypeCheck" part be an argument of the
  GetSessionVariable? To reduce the code duplication a bit.

* Just out of curiosity, why TopTransactionContext?

    /*
     * Store domain_check extra in TopTransactionContext. When we are in
     * other transaction, the domain_check_extra cache is not valid
     * anymore.
     */
    if (svar->domain_check_extra_lxid != MyProc->lxid)
        svar->domain_check_extra = NULL;

    domain_check(svar->value, svar->isnull,
                 svar->typid, &svar->domain_check_extra,
                 TopTransactionContext);

* In SVariableData it would be great to have more comments around
  freeval, domain_check_extra, domain_check_extra_lxid.

* Nitpicking, but the term "shadowing" for ambiguity between a session
  variable and a table column might be confusing, one can imagine there
  is a connection between those two objects and one actively follows
  ("shadows") the other one.

* The second patch "Storage for session variables and SQL interface"
  mentions in the documentation default and temporary variables:

   <para>
    The value of a session variable is local to the current session. Retrieving
    a variable's value returns either a <literal>NULL</literal> or a default
    value, unless its value has been set to something else in the current
    session using the <command>LET</command> command. The content of a variable
    is not transactional. This is the same as regular variables in PL languages.
    The session variables can be persistent or can be temporary. In both cases,
    the content of session variables is temporary and not shared (like an
    content of temporary tables).
   </para>

  They're implemented in the following patches, so it belongs there.

* Nitpicking, maybe merge those two conditions together for readability?

    if (!needs_validation)
        return;

    /*
     * Reset, this flag here, before we start the validation. It can be set to
     * on by incomming sinval message.
     */
    needs_validation = false;

    if (!sessionvars)
        return;

* This one is not very clear, what is the difference between "somewhere
  inside a transaction" and "at the end of a transaction"?

   /*
    * This routine can be called somewhere inside transaction or at an transaction
    * end. When atEOX argument is false, then we are inside transaction, and we
    * don't want to throw entries related to session variables dropped in current
    * transaction.
    */

# Language topic

Since this patch introduces a large body of documentation and
commentaries, I think it would benefit from a native speaker review.
I've stumbled upon few examples (attached with proposed wording, without
a diff extension to not confuse the CF bot), but otherwise if anyone
follows this thread, texts review is appreciated.

Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

st 22. 11. 2023 v 7:20 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Oct 17, 2023 at 08:52:13AM +0200, Pavel Stehule wrote:
>
> When I thought about global temporary tables, I got one maybe interesting
> idea. The one significant problem of global temporary tables is place for
> storing info about size or column statistics.
>
> I think so these data can be stored simply in session variables. Any global
> temporary table can get assigned one session variable, that can hold these
> data.

I don't know how realistic this would be.  For instance it will require to
properly link the global temporary table life cycle with the session variable
and I'm afraid it would require to add some hacks to make it work as needed.

But this still raises the question of whether this feature could be used
internally for the need of another feature.  If we think it's likely, should we
try to act right now and reserve the "pg_" prefix for internal use rather than
do that a few years down the line and probably break some user code as it was
done recently for the role names?

I don't think it is necessary. Session variables (in this design) are joined with schemas. If we use some session variables for system purposes, we can use some dedicated schema. But when I think about it in detail, probably my own dedicated storage (hash table in session memory) can be much better than session variables. What can be shared (maybe) is probably sinval message processing.

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 26. 11. 2023 v 18:56 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote:
> so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
> > As a side note, I'm intended to go one more time through the first few
> > patches introducing the basic functionality, and then mark it as ready
> > in CF. I can't break the patch in testing since quite long time, and for
> > most parts the changes make sense to me.
>
> I marked pg_session_variables function as PARALLEL RESTRICTED, and did
> rebase

So, after one week of uninterrupted evening reviews I've made it through
the first four patches :)

It's a decent job -- more than once, looking at the code, I thought I
could construct a case when it's going to blow up, but everything was
working just fine. Yet, I think the patch still has to be reshaped a bit
before moving forward. I've got a couple proposals of different nature:
high level changes (you probably won't like some of them, but I'm sure
they're going to be useful), technical code-level improvements/comments,
and few language changes. With those changes in mind I would be
satisfied with the patch, and hopefully they would also make it easier
for a potential committer to pick it up.

# High level proposals

* I would suggest reducing the scope of the patch as much as possible,
  and not just by trimming on the edges, but rather following Phileas
  Fogg's example with the steamboat Henrietta -- get rid of all
  non-essential parts. This will make this rather large patch more
  approachable for others.

  For that one can concentrate only on the first two patches plus the
  fourth one (memory cleanup after dropping variables), leaving DISCARD,
  ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the
  future.

  Another thing in this context would be to evaluate plpgsql support for
  this feature. You know the use case better than me, how important it
  is? Is it an intrinsic part of the feature, or session variables could
  be still valuable enough even without plpgsql? From what I see
  postponing plgpsql will make everything about ~800 lines lighter (most
  likely more), and also allow to ignore couple of concerns about the
  implementation (about this later).

* The new GUC session_variables_ambiguity_warning is definitely going to
  cause many objections, it's another knob to manage very subtle
  behaviour detail very few people will ever notice. I see the point
  behind warning about ambiguity, so probably it makes sense to bite the
  bullet and decide one way or another. The proposal is to warn always
  in potentially ambiguous situations, and if concerns are high about
  logging too much, maybe do the warning on lower logging levels.

# Code-level observations

* It feels a bit awkward to have varid assignment logic in a separate
  function, what about adding an argument with varid to
  CreateVariableDestReceiver? SetVariableDestReceiverVarid still could
  be used for CreateDestReceiver.

    /*
     * Initially create a DestReceiver object.
     */
    DestReceiver *
    CreateVariableDestReceiver(void)

    /*
     * Set parameters for a VariableDestReceiver.
     * Should be called right after creating the DestReceiver.
     */
    void
    SetVariableDestReceiverVarid(DestReceiver *self, Oid varid)

* It's worth it to add a commentary here explaining why it's fine to use
  InvalidOid here:

     if (pstmt->commandType != CMD_UTILITY)
-           ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+           ExplainOnePlan(pstmt, into, InvalidOid, es, query_string, paramLI, queryEnv,
                           &planduration, (es->buffers ? &bufusage : NULL));

  My understanding is that since LetStmt is CMD_UTILITY, this branch
  will never be visited for a session variable.

* IIUC this one is introduced to exclude session variables from the normal
  path with EXPR_KIND_UPDATE_TARGET:

+   EXPR_KIND_ASSIGN_VARIABLE,      /* PL/pgSQL assignment target - disallow
+                                                            * session variables */

  But the name doesn't sound right, maybe longer
  EXPR_KIND_UPDATE_TARGET_NO_VARS is better?

* I'm curious about this one, which exactly part does this change cover?

@@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node,
-               if (param->paramkind != PARAM_EXTERN)
+               if (param->paramkind != PARAM_EXTERN &&
+                       param->paramkind != PARAM_VARIABLE)
                        elog(ERROR, "unexpected paramkind: %d", (int) param->paramkind);

  I've commented it out, but no tests were affected.

* Does it mean there could be theoretically two LET statements at the
  same time with different command type, one CMD_UTILITY, one
  CMD_SELECT? Can it cause any issues?

+       /*
+        * Inside PL/pgSQL we don't want to execute LET statement as utility
+        * command, because it disallow to execute expression as simple
+        * expression. So for PL/pgSQL we have extra path, and we return SELECT.
+        * Then it can be executed by exec_eval_expr. Result is dirrectly assigned
+        * to target session variable inside PL/pgSQL LET statement handler. This
+        * is extra code, extra path, but possibility to get faster execution is
+        * too attractive.
+        */
+       if (stmt->plpgsql_mode)
+               return query;
+

* This probably requires more explanation, is warning the only reason
  for this change?

+        *
+        * The session variables should not be used as target of PL/pgSQL assign
+        * statement. So we should to use special parser expr kind, that disallow
+        * usage of session variables. This block unwanted (in this context)
+        * possible warning so target PL/pgSQL's variable shadows some session
+        * variable.
         */
        target = transformExpr(pstate, (Node *) cref,
-                                                  EXPR_KIND_UPDATE_TARGET);
+                                                  EXPR_KIND_ASSIGN_VARIABLE);

* It would be great to have more commentaries here:

        typedef struct
        {
                DestReceiver pub;
                Oid            varid;
                Oid            typid;
                int32        typmod;
                int            typlen;
                int            slot_offset;
                int            rows;
        } SVariableState;

  For example, why does it make sense to have a field rows, where we
  interested to only know the fact that there is exactly one column?

* Why there is SetSessionVariableWithSecurityCheck, but no
  GetSessionVariableWithSecurityCheck? Instead, object_aclcheck is done
  in standard_ExecutorStart, which looks a bit out of place.

* pg_session_variables -- you mention it exists only for testing. What
  about moving it out into a separate patch for the sake of slimming
  down? It looks like it's used only in tests for "memory cleanup"
  patch, maybe they could be restructured to not require this function.

* Probably it's time to drop unnecessary historical notes, like this:

 * Note: originally we enhanced a list xact_recheck_varids here. Unfortunately
 * it was not safe and a little bit too complex, because the sinval callback
 * function can be called when we iterate over xact_recheck_varids list.
 * Another issue was the possibility of being out of memory when we enhanced
 * the list. So now we just switch flag in related entry sessionvars hash table.
 * We need to iterate over hash table on every sinval message, so extra two
 * iteration over this hash table is not significant overhead (and we skip
 * entries that don't require recheck). Now we do not have any memory allocation
 * in the sinval handler (This note can be removed before commit).

* The second patch "Storage for session variables and SQL interface",
  mentions DISCARD command:

    /*
     * There is no guarantee of sessionvars being initialized, even when
     * receiving an invalidation callback, as DISCARD [ ALL | VARIABLES ]
     * destroys the hash table entirely.
     */

  This command is implemented in another patch later one, so this
  comment probably belong there.

* This comment mentions a "direct access, without buffering":

        /*
         * Direct access to session variable (without buffering). Because
         * returned value can be used (without an assignement) after the
         * referenced session variables is updated, we have to use an copy
         * of stored value every time.
         */
        *op->resvalue = GetSessionVariableWithTypeCheck(op->d.vparam.varid,
                                                                                                        op->resnull,
                                                                                                        op->d.vparam.vartype);

  But GetSessionVariableWithTypeCheck goes through get_session_variable
  and searches in the hash table. What "buffering" means in this
  context?

* GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid expected_typid)

  Should the "WithTypeCheck" part be an argument of the
  GetSessionVariable? To reduce the code duplication a bit.

* Just out of curiosity, why TopTransactionContext?

        /*
         * Store domain_check extra in TopTransactionContext. When we are in
         * other transaction, the domain_check_extra cache is not valid
         * anymore.
         */
        if (svar->domain_check_extra_lxid != MyProc->lxid)
                svar->domain_check_extra = NULL;

        domain_check(svar->value, svar->isnull,
                                 svar->typid, &svar->domain_check_extra,
                                 TopTransactionContext);

* In SVariableData it would be great to have more comments around
  freeval, domain_check_extra, domain_check_extra_lxid.

* Nitpicking, but the term "shadowing" for ambiguity between a session
  variable and a table column might be confusing, one can imagine there
  is a connection between those two objects and one actively follows
  ("shadows") the other one.

* The second patch "Storage for session variables and SQL interface"
  mentions in the documentation default and temporary variables:

   <para>
    The value of a session variable is local to the current session. Retrieving
    a variable's value returns either a <literal>NULL</literal> or a default
    value, unless its value has been set to something else in the current
    session using the <command>LET</command> command. The content of a variable
    is not transactional. This is the same as regular variables in PL languages.
    The session variables can be persistent or can be temporary. In both cases,
    the content of session variables is temporary and not shared (like an
    content of temporary tables).
   </para>

  They're implemented in the following patches, so it belongs there.

* Nitpicking, maybe merge those two conditions together for readability?

    if (!needs_validation)
        return;

    /*
     * Reset, this flag here, before we start the validation. It can be set to
     * on by incomming sinval message.
     */
    needs_validation = false;

    if (!sessionvars)
        return;

* This one is not very clear, what is the difference between "somewhere
  inside a transaction" and "at the end of a transaction"?

   /*
        * This routine can be called somewhere inside transaction or at an transaction
        * end. When atEOX argument is false, then we are inside transaction, and we
        * don't want to throw entries related to session variables dropped in current
        * transaction.
        */

# Language topic

Since this patch introduces a large body of documentation and
commentaries, I think it would benefit from a native speaker review.
I've stumbled upon few examples (attached with proposed wording, without
a diff extension to not confuse the CF bot), but otherwise if anyone
follows this thread, texts review is appreciated.

Thank you for your review.  Next two weeks I'll not too much time to work on this patch - I have to work on some commercial work, and the week is Prague PgConf, so my reply will be slow. But after these events I'll concentrate on this patch.

Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Sun, Dec 03, 2023 at 06:04:12AM +0100, Pavel Stehule wrote:
>
> Thank you for your review.  Next two weeks I'll not too much time to work
> on this patch - I have to work on some commercial work, and the week is
> Prague PgConf, so my reply will be slow. But after these events I'll
> concentrate on this patch.

No worries, it's fine. Have fun at PGConf!



Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
Thanks for the update, smaller patches looks promising.

Off the list Pavel has mentioned that the first two patches contain a
bare minimum for session variables, so I've reviewed them once more and
suggest to concentrate on them first. I'm afraid the memory cleanup
patch has to be added to the "bare minimum" set as well -- otherwise in
my tests it was too easy to run out of memory via creating, assigning
and dropping variables. Unfortunately one can't extract those three
patches from the series and apply only them, the memory patch would have
some conflicts. Can you maybe reshuffle the series to have those patches
(1, 2 + 8) as first three?

If that's possible, my proposal would be to proceed with them first. To the
best of my knowledge they look good to me, except few minor details:

* The documentation says in a couple of places (ddl.sgml,
  create_variable.sgml) that "Retrieving a session variable's value
  returns either a NULL or a default value", but as far as I see the
  default value feature is not implemented within first two patches.

* Similar with mentioning immutable session variables in plpgsql.sgml .

* Commentary to LookupVariable mentions a rowtype_only argument:

    +/*
    + * Returns oid of session variable specified by possibly qualified identifier.
    + *
    + * If not found, returns InvalidOid if missing_ok, else throws error.
    + * When rowtype_only argument is true the session variables of not
    + * composite types are ignored. This should to reduce possible collisions.
    + */
    +Oid
    +LookupVariable(const char *nspname,
    +                          const char *varname,
    +                          bool missing_ok)

  but the function doesn't have it.

* I've noticed an interesting result when a LET statement is used to assign a
  value without a subquery:

    create variable test as text;
    -- returns NULL
    select test;

    -- use repeat directly without a subquery
    let test = repeat("test", 100000);

    -- returns NULL
    select test;

  I was expecting to see an error here, is this a correct behaviour?



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
Thanks for the update, smaller patches looks promising.

Off the list Pavel has mentioned that the first two patches contain a
bare minimum for session variables, so I've reviewed them once more and
suggest to concentrate on them first. I'm afraid the memory cleanup
patch has to be added to the "bare minimum" set as well -- otherwise in
my tests it was too easy to run out of memory via creating, assigning
and dropping variables. Unfortunately one can't extract those three
patches from the series and apply only them, the memory patch would have
some conflicts. Can you maybe reshuffle the series to have those patches
(1, 2 + 8) as first three?

If that's possible, my proposal would be to proceed with them first. To the
best of my knowledge they look good to me, except few minor details:

* The documentation says in a couple of places (ddl.sgml,
  create_variable.sgml) that "Retrieving a session variable's value
  returns either a NULL or a default value", but as far as I see the
  default value feature is not implemented within first two patches.

* Similar with mentioning immutable session variables in plpgsql.sgml .

* Commentary to LookupVariable mentions a rowtype_only argument:

        +/*
        + * Returns oid of session variable specified by possibly qualified identifier.
        + *
        + * If not found, returns InvalidOid if missing_ok, else throws error.
        + * When rowtype_only argument is true the session variables of not
        + * composite types are ignored. This should to reduce possible collisions.
        + */
        +Oid
        +LookupVariable(const char *nspname,
        +                          const char *varname,
        +                          bool missing_ok)

  but the function doesn't have it.

* I've noticed an interesting result when a LET statement is used to assign a
  value without a subquery:

        create variable test as text;
        -- returns NULL
        select test;

        -- use repeat directly without a subquery
        let test = repeat("test", 100000);

        -- returns NULL
        select test;

  I was expecting to see an error here, is this a correct behaviour?

what is strange on this result?

(2024-01-28 20:32:05) postgres=# let test = 'ab';
LET
(2024-01-28 20:32:12) postgres=# let test = repeat("test", 10);
LET
(2024-01-28 20:32:19) postgres=# select test;
┌──────────────────────┐
│         test         │
╞══════════════════════╡
│ abababababababababab │
└──────────────────────┘
(1 row)

(2024-01-28 20:32:21) postgres=# let test = null;
LET
(2024-01-28 20:32:48) postgres=# let test = repeat("test", 10);
LET
(2024-01-28 20:32:51) postgres=# select test;
┌──────┐
│ test │
╞══════╡
│ ∅    │
└──────┘
(1 row)

(2024-01-28 20:32:53) postgres=# select repeat(test, 10);
┌────────┐
│ repeat │
╞════════╡
│ ∅      │
└────────┘
(1 row)

"repeat" is the usual scalar function. Maybe you thought different function

 

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Sun, Jan 28, 2024 at 08:34:40PM +0100, Pavel Stehule wrote:
> > * I've noticed an interesting result when a LET statement is used to
> > assign a
> >   value without a subquery:
> >
> >         create variable test as text;
> >         -- returns NULL
> >         select test;
> >
> >         -- use repeat directly without a subquery
> >         let test = repeat("test", 100000);
> >
> >         -- returns NULL
> >         select test;
> >
> >   I was expecting to see an error here, is this a correct behaviour?
> >
>
> what is strange on this result?

Never mind, I've got confused about the quotes here -- it was referring
to the variable content, not a string.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
Thanks for the update, smaller patches looks promising.

Off the list Pavel has mentioned that the first two patches contain a
bare minimum for session variables, so I've reviewed them once more and
suggest to concentrate on them first. I'm afraid the memory cleanup
patch has to be added to the "bare minimum" set as well -- otherwise in
my tests it was too easy to run out of memory via creating, assigning
and dropping variables. Unfortunately one can't extract those three
patches from the series and apply only them, the memory patch would have
some conflicts. Can you maybe reshuffle the series to have those patches
(1, 2 + 8) as first three?

probably you need too

0006-function-pg_session_variables-for-cleaning-tests.patch and
0007-DISCARD-VARIABLES.patch
 
6 is necessary for testing of cleaning


If that's possible, my proposal would be to proceed with them first. To the
best of my knowledge they look good to me, except few minor details:

* The documentation says in a couple of places (ddl.sgml,
  create_variable.sgml) that "Retrieving a session variable's value
  returns either a NULL or a default value", but as far as I see the
  default value feature is not implemented within first two patches.

should be fixed
 

* Similar with mentioning immutable session variables in plpgsql.sgml .

fixed
 

* Commentary to LookupVariable mentions a rowtype_only argument:

        +/*
        + * Returns oid of session variable specified by possibly qualified identifier.
        + *
        + * If not found, returns InvalidOid if missing_ok, else throws error.
        + * When rowtype_only argument is true the session variables of not
        + * composite types are ignored. This should to reduce possible collisions.
        + */
        +Oid
        +LookupVariable(const char *nspname,
        +                          const char *varname,
        +                          bool missing_ok)

  but the function doesn't have it.

removed

Regards

Pavel

 

* I've noticed an interesting result when a LET statement is used to assign a
  value without a subquery:

        create variable test as text;
        -- returns NULL
        select test;

        -- use repeat directly without a subquery
        let test = repeat("test", 100000);

        -- returns NULL
        select test;

  I was expecting to see an error here, is this a correct behaviour?
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Mon, Jan 29, 2024 at 08:57:42AM +0100, Pavel Stehule wrote:
> Hi
>
> ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > Thanks for the update, smaller patches looks promising.
> >
> > Off the list Pavel has mentioned that the first two patches contain a
> > bare minimum for session variables, so I've reviewed them once more and
> > suggest to concentrate on them first. I'm afraid the memory cleanup
> > patch has to be added to the "bare minimum" set as well -- otherwise in
> > my tests it was too easy to run out of memory via creating, assigning
> > and dropping variables. Unfortunately one can't extract those three
> > patches from the series and apply only them, the memory patch would have
> > some conflicts. Can you maybe reshuffle the series to have those patches
> > (1, 2 + 8) as first three?
> >
>
> probably you need too
>
> 0006-function-pg_session_variables-for-cleaning-tests.patch and
> 0007-DISCARD-VARIABLES.patch
>
> 6 is necessary for testing of cleaning

Ok, let me take a look at those. Unless there are any objections, my
plan would be to give it a final check and mark the CF item as ready for
committer -- meaning the first 5 patches.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


po 29. 1. 2024 v 19:36 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Mon, Jan 29, 2024 at 08:57:42AM +0100, Pavel Stehule wrote:
> Hi
>
> ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > Thanks for the update, smaller patches looks promising.
> >
> > Off the list Pavel has mentioned that the first two patches contain a
> > bare minimum for session variables, so I've reviewed them once more and
> > suggest to concentrate on them first. I'm afraid the memory cleanup
> > patch has to be added to the "bare minimum" set as well -- otherwise in
> > my tests it was too easy to run out of memory via creating, assigning
> > and dropping variables. Unfortunately one can't extract those three
> > patches from the series and apply only them, the memory patch would have
> > some conflicts. Can you maybe reshuffle the series to have those patches
> > (1, 2 + 8) as first three?
> >
>
> probably you need too
>
> 0006-function-pg_session_variables-for-cleaning-tests.patch and
> 0007-DISCARD-VARIABLES.patch
>
> 6 is necessary for testing of cleaning

Ok, let me take a look at those. Unless there are any objections, my
plan would be to give it a final check and mark the CF item as ready for
committer -- meaning the first 5 patches.

sure.

Thank you very much.

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
Yep, in this constellation the implementation holds much better (in
terms of memory) in my create/let/drop testing.

I've marked the CF item as ready for committer, but a note for anyone
who would like to pick up it from here -- we're talking about first 5
patches here, up to the memory cleaning after DROP VARIABLE. It doesn't
mean the rest is somehow not worth it, but I believe it's a good first
step.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


út 30. 1. 2024 v 20:15 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
Yep, in this constellation the implementation holds much better (in
terms of memory) in my create/let/drop testing.

I've marked the CF item as ready for committer, but a note for anyone
who would like to pick up it from here -- we're talking about first 5
patches here, up to the memory cleaning after DROP VARIABLE. It doesn't
mean the rest is somehow not worth it, but I believe it's a good first
step.

Thank you very much

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi,

here is new rebase of this patch set.

Years ago I promised to implement support for transactional behaviour. I wrote it in patch 0019. It is based on my patch from 2020 but the memory cleaning is more readable and I believe it is correct. All other patches are without touching. The first five patches are of "should to have" type, all others (with new one) are "nice to have" type (although support for simply expr evaluation or parallel execution has strong benefits).  

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

út 20. 2. 2024 v 20:29 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi,

here is new rebase of this patch set.

Years ago I promised to implement support for transactional behaviour. I wrote it in patch 0019. It is based on my patch from 2020 but the memory cleaning is more readable and I believe it is correct. All other patches are without touching. The first five patches are of "should to have" type, all others (with new one) are "nice to have" type (although support for simply expr evaluation or parallel execution has strong benefits).  

fresh rebase
 

Regards

Pavel
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Alvaro Herrera
Date:
On 2024-Jan-30, Dmitry Dolgov wrote:

> Yep, in this constellation the implementation holds much better (in
> terms of memory) in my create/let/drop testing.
> 
> I've marked the CF item as ready for committer, but a note for anyone
> who would like to pick up it from here -- we're talking about first 5
> patches here, up to the memory cleaning after DROP VARIABLE. It doesn't
> mean the rest is somehow not worth it, but I believe it's a good first
> step.

Hmm, I think patch 16 is essential, because the point of variable shadowing
is a critical aspect of how the whole thing works.  So I would say that
a first step would be those first five patches plus 16.

I want to note that when we discussed this patch series at the dev
meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
schema variables at all because of the fact that creating a variable
would potentially change the meaning of queries by shadowing table
columns.  But this turns out to be incorrect: it's _variables_ that are
shadowed by table columns, not the other way around.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 18. 5. 2024 v 18:31 odesílatel Alvaro Herrera <alvherre@alvh.no-ip.org> napsal:
On 2024-Jan-30, Dmitry Dolgov wrote:

> Yep, in this constellation the implementation holds much better (in
> terms of memory) in my create/let/drop testing.
>
> I've marked the CF item as ready for committer, but a note for anyone
> who would like to pick up it from here -- we're talking about first 5
> patches here, up to the memory cleaning after DROP VARIABLE. It doesn't
> mean the rest is somehow not worth it, but I believe it's a good first
> step.

Hmm, I think patch 16 is essential, because the point of variable shadowing
is a critical aspect of how the whole thing works.  So I would say that
a first step would be those first five patches plus 16.

I'll move patch 16 to 6 position

Regards

Pavel

I want to note that when we discussed this patch series at the dev
meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
schema variables at all because of the fact that creating a variable
would potentially change the meaning of queries by shadowing table
columns.  But this turns out to be incorrect: it's _variables_ that are
shadowed by table columns, not the other way around.

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

po 20. 5. 2024 v 9:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


so 18. 5. 2024 v 18:31 odesílatel Alvaro Herrera <alvherre@alvh.no-ip.org> napsal:
On 2024-Jan-30, Dmitry Dolgov wrote:

> Yep, in this constellation the implementation holds much better (in
> terms of memory) in my create/let/drop testing.
>
> I've marked the CF item as ready for committer, but a note for anyone
> who would like to pick up it from here -- we're talking about first 5
> patches here, up to the memory cleaning after DROP VARIABLE. It doesn't
> mean the rest is somehow not worth it, but I believe it's a good first
> step.

Hmm, I think patch 16 is essential, because the point of variable shadowing
is a critical aspect of how the whole thing works.  So I would say that
a first step would be those first five patches plus 16.

I'll move patch 16 to 6 position

reorderd set of patches - I moved forward  plpgsql-tests.patch and GUC-session_variables_ambiguity_warning.patch

0006-plpgsql-tests.patch
0007-GUC-session_variables_ambiguity_warning.patch

no other changes

Regards

Pavel
 

Regards

Pavel

I want to note that when we discussed this patch series at the dev
meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
schema variables at all because of the fact that creating a variable
would potentially change the meaning of queries by shadowing table
columns.  But this turns out to be incorrect: it's _variables_ that are
shadowed by table columns, not the other way around.

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)
Attachment

Re: Schema variables - new implementation for Postgres 15

From
Peter Eisentraut
Date:
On 18.05.24 13:29, Alvaro Herrera wrote:
> I want to note that when we discussed this patch series at the dev
> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
> schema variables at all because of the fact that creating a variable
> would potentially change the meaning of queries by shadowing table
> columns.  But this turns out to be incorrect: it's_variables_  that are
> shadowed by table columns, not the other way around.

But that's still bad, because seemingly unrelated schema changes can 
make variables appear and disappear.  For example, if you have

SELECT a, b FROM table1

and then you drop column b, maybe the above query continues to work 
because there is also a variable b.  Or maybe it now does different 
things because b is of a different type.  This all has the potential to 
be very confusing.




Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Wed, May 22, 2024 at 02:37:49PM +0200, Peter Eisentraut wrote:
> On 18.05.24 13:29, Alvaro Herrera wrote:
> > I want to note that when we discussed this patch series at the dev
> > meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
> > schema variables at all because of the fact that creating a variable
> > would potentially change the meaning of queries by shadowing table
> > columns.  But this turns out to be incorrect: it's_variables_  that are
> > shadowed by table columns, not the other way around.
>
> But that's still bad, because seemingly unrelated schema changes can make
> variables appear and disappear.  For example, if you have
>
> SELECT a, b FROM table1
>
> and then you drop column b, maybe the above query continues to work because
> there is also a variable b.  Or maybe it now does different things because b
> is of a different type.  This all has the potential to be very confusing.

Yeah, that's a bummer. Interestingly enough, the db2 implementation of
global session variables mechanism is mentioned as similar to what we
have in the patch. But weirdly, the db2 documentation just states
possibility of a resolution conflict for unqualified names, nothing
else.

There was extensive discussion about this problem early in the thread,
and one alternative is to use some sort of special syntax every time
when working with a variable to clear any ambiguity [1]. It's more
verbose, has to be careful to not block some useful syntax for other
stuff, etc. But as Pavel said:

> The different syntax disallows any collision well, it is far to what is
> more usual standard in this area. And if we introduce special syntax, then
> there is no way back. We cannot use :varname - this syntax is used already,
> but we can use, theoretically, @var or $var. But, personally, I don't want
> to use it, if there is possibility to do without it.

It seems to me there is no other possibility to resolve those ambiguity
issues.

[1]: https://www.postgresql.org/message-id/CAFj8pRD03hwZK%2B541KDt4Eo5YuC81CBBX_P0Sa5A7g5TQFsTww%40mail.gmail.com



Re: Schema variables - new implementation for Postgres 15

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> On 18.05.24 13:29, Alvaro Herrera wrote:
>> I want to note that when we discussed this patch series at the dev
>> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
>> schema variables at all because of the fact that creating a variable
>> would potentially change the meaning of queries by shadowing table
>> columns.  But this turns out to be incorrect: it's_variables_  that are
>> shadowed by table columns, not the other way around.

> But that's still bad, because seemingly unrelated schema changes can 
> make variables appear and disappear.  For example, if you have
>     SELECT a, b FROM table1
> and then you drop column b, maybe the above query continues to work 
> because there is also a variable b.

Yeah, that seems pretty dangerous.  Could we make it safe enough
by requiring some qualification on variable names?  That is, if
you mean b to be a variable, then you must write something like

    SELECT a, pg_variables.b FROM table1

This is still ambiguous if you use "pg_variables" as a table alias in
the query, but the alias would win so the query still means what it
meant before.  Also, table aliases (as opposed to actual table names)
don't change readily, so I don't think there's much risk of the query
suddenly meaning something different than it did yesterday.

            regards, tom lane



Re: Schema variables - new implementation for Postgres 15

From
Alvaro Herrera
Date:
On 2024-May-22, Dmitry Dolgov wrote:

> Yeah, that's a bummer. Interestingly enough, the db2 implementation of
> global session variables mechanism is mentioned as similar to what we
> have in the patch. But weirdly, the db2 documentation just states
> possibility of a resolution conflict for unqualified names, nothing
> else.

Perhaps the solution to all this is to avoid having the variables be
implicitly present in the range table of all queries.  Instead, if you
need a variable's value, then you need to add the variable to the FROM
clause; and if you try to read from the variable and the name conflicts
with that of a column in one of the tables in the FROM clause, then you
get an error that the name is ambiguous and invites to qualify it.
Like, for instance,

create table lefttab (a int, b int);
create table righttab (c int, d int, b int);

=# select b from lefttab, righttab;
ERROR:  column reference "b" is ambiguous
LÍNEA 1: select b from lefttab, righttab;
                ^

but this works fine because there's no longer an ambiguity:

select lefttab.b from lefttab, righttab;
 b 
───
(0 filas)


Nothing breaks if you create new variables, because your queries won't
see them until you explicitly request them.  And if you add add columns
to either tables or variables, it's possible that some queries would
start having ambiguous references, in which case they'll just stop
working until you disambiguate by editing the query.


Now, Pavel has been saying that variables are simple and cannot break
queries (because they're always shadowed), which is why they're always
implicitly visible to all queries[1]; but maybe that's a mistake.

[1] https://postgr.es/m/CAFj8pRA2P7uaFGpFJxVHrHFtizBCN41J00BrEotspdD+urGBLQ@mail.gmail.com

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La experiencia nos dice que el hombre peló millones de veces las patatas,
pero era forzoso admitir la posibilidad de que en un caso entre millones,
las patatas pelarían al hombre" (Ijon Tichy)



Re: Schema variables - new implementation for Postgres 15

From
walther@technowledgy.de
Date:
Alvaro Herrera:
> Perhaps the solution to all this is to avoid having the variables be
> implicitly present in the range table of all queries.  Instead, if you
> need a variable's value, then you need to add the variable to the FROM
> clause;

+1

This should make it easier to work with composite type schema variables 
in some cases.  It could also enable schema qualifying of schema 
variables, or at least make it easier to do, I think.

In this case variables would share the same namespace as tables and 
views, right?  So I could not create a variable with the same name as 
another table.  Which is a good thing, I guess.  Not sure how it's 
currently implemented in the patch.

Best,

Wolfgang



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 22. 5. 2024 v 14:37 odesílatel Peter Eisentraut <peter@eisentraut.org> napsal:
On 18.05.24 13:29, Alvaro Herrera wrote:
> I want to note that when we discussed this patch series at the dev
> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
> schema variables at all because of the fact that creating a variable
> would potentially change the meaning of queries by shadowing table
> columns.  But this turns out to be incorrect: it's_variables_  that are
> shadowed by table columns, not the other way around.

But that's still bad, because seemingly unrelated schema changes can
make variables appear and disappear.  For example, if you have

SELECT a, b FROM table1

and then you drop column b, maybe the above query continues to work
because there is also a variable b.  Or maybe it now does different
things because b is of a different type.  This all has the potential to
be very confusing.

In the described case, the variable's shadowing warning will be raised.

There are more cases where not well designed changes (just with tables) can break queries or change results. Adding columns can be a potential risk, creating tables or dropping tables (when the search path contains more schemas) too.

Good practice is using well designed names and almost all use aliases or labels, and it is one way to minimize real risks. Personally I prefer a very strict mode that disallows shadowing, conflicts, ... but on second hand, for some usual work this strict mode can be boring, so we should find some good compromise.

Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 22. 5. 2024 v 19:25 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Peter Eisentraut <peter@eisentraut.org> writes:
> On 18.05.24 13:29, Alvaro Herrera wrote:
>> I want to note that when we discussed this patch series at the dev
>> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
>> schema variables at all because of the fact that creating a variable
>> would potentially change the meaning of queries by shadowing table
>> columns.  But this turns out to be incorrect: it's_variables_  that are
>> shadowed by table columns, not the other way around.

> But that's still bad, because seemingly unrelated schema changes can
> make variables appear and disappear.  For example, if you have
>       SELECT a, b FROM table1
> and then you drop column b, maybe the above query continues to work
> because there is also a variable b.

Yeah, that seems pretty dangerous.  Could we make it safe enough
by requiring some qualification on variable names?  That is, if
you mean b to be a variable, then you must write something like

        SELECT a, pg_variables.b FROM table1

This is still ambiguous if you use "pg_variables" as a table alias in
the query, but the alias would win so the query still means what it
meant before.  Also, table aliases (as opposed to actual table names)
don't change readily, so I don't think there's much risk of the query
suddenly meaning something different than it did yesterday.

With active shadowing variable warning for described example you will get a warning before dropping.

Session variables are joined with schema (in my proposal). Do anybody can do just

CREATE SCHEMA svars; -- or what (s)he likes
CREATE VARIABLE svars.b AS int;

SELECT a, b FROM table1

and if somebody can be really safe, the can write

SELECT t.a, t.b FROM table1 t

or

SELECT t.a, svars.b FROM table1 t

It can be customized in the way anybody prefers - just creating dedicated schemas and setting search_path. Using its own schema for session variables without enhancing search_path for this schema forces the necessity to set only qualified names for session variables.

Sure the naming of schemas, aliases can be unhappy wrong, and there can be the problem. But this can be a problem today too.

Regards

Pavel
 

                        regards, tom lane

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 22. 5. 2024 v 20:21 odesílatel <walther@technowledgy.de> napsal:
Alvaro Herrera:
> Perhaps the solution to all this is to avoid having the variables be
> implicitly present in the range table of all queries.  Instead, if you
> need a variable's value, then you need to add the variable to the FROM
> clause;

+1

This should make it easier to work with composite type schema variables
in some cases.  It could also enable schema qualifying of schema
variables, or at least make it easier to do, I think.

In this case variables would share the same namespace as tables and
views, right?  So I could not create a variable with the same name as
another table.  Which is a good thing, I guess.  Not sure how it's
currently implemented in the patch.

I don't like this. Sure, this fixes the problem with collisions, but then we cannot talk about variables. When some is used like a table, then it should be a table. I can imagine memory tables, but it is a different type of object. Table is relation, variable is just value. Variables should not have columns, so using the same patterns for tables and variables has no sense. Using the same catalog for variables and tables. Variables just hold a value, and then you can use it inside a query without necessity to write JOIN. Variables are not tables, and then it is not too confusing so they are not transactional and don't support more rows, more columns.

The problem with collision can be solved very easily - just use a dedicated schema (only for variables) and don't use it in the search path.

In this case, the unwanted collision is not too probable - although it is possible, if you use a schema name for a variable same like table name or alias name.

I can use

CREATE SCHEMA __;
CREATE VARIABLE __.a AS int;

SELECT __.a;

although it is maybe wild, probably nobody will use alias or table name __ and then there should not be any problem





 

Best,

Wolfgang

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


st 22. 5. 2024 v 14:37 odesílatel Peter Eisentraut <peter@eisentraut.org> napsal:
On 18.05.24 13:29, Alvaro Herrera wrote:
> I want to note that when we discussed this patch series at the dev
> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
> schema variables at all because of the fact that creating a variable
> would potentially change the meaning of queries by shadowing table
> columns.  But this turns out to be incorrect: it's_variables_  that are
> shadowed by table columns, not the other way around.

But that's still bad, because seemingly unrelated schema changes can
make variables appear and disappear.  For example, if you have

SELECT a, b FROM table1

and then you drop column b, maybe the above query continues to work
because there is also a variable b.  Or maybe it now does different
things because b is of a different type.  This all has the potential to
be very confusing.

The detection of possible conflicts works well (in or outside PL too)

create variable x as int;
create table foo(x int);
insert into foo values(110);

set session_variables_ambiguity_warning to on;

(2024-05-23 08:22:34) postgres=# do $$                                        
begin
  raise notice '%', (select x from foo);
end;
$$;
WARNING:  session variable "x" is shadowed
LINE 1: (select x from foo)
                ^
DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
QUERY:  (select x from foo)
NOTICE:  110
DO
(2024-05-23 08:22:35) postgres=# do $$ declare x int default 100;
begin
  raise notice '%', x;
end;
$$;
WARNING:  session variable "x" is shadowed
LINE 1: x
        ^
DETAIL:  Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name.
QUERY:  x
NOTICE:  100
DO


Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

st 22. 5. 2024 v 16:14 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Wed, May 22, 2024 at 02:37:49PM +0200, Peter Eisentraut wrote:
> On 18.05.24 13:29, Alvaro Herrera wrote:
> > I want to note that when we discussed this patch series at the dev
> > meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
> > schema variables at all because of the fact that creating a variable
> > would potentially change the meaning of queries by shadowing table
> > columns.  But this turns out to be incorrect: it's_variables_  that are
> > shadowed by table columns, not the other way around.
>
> But that's still bad, because seemingly unrelated schema changes can make
> variables appear and disappear.  For example, if you have
>
> SELECT a, b FROM table1
>
> and then you drop column b, maybe the above query continues to work because
> there is also a variable b.  Or maybe it now does different things because b
> is of a different type.  This all has the potential to be very confusing.

Yeah, that's a bummer. Interestingly enough, the db2 implementation of
global session variables mechanism is mentioned as similar to what we
have in the patch. But weirdly, the db2 documentation just states
possibility of a resolution conflict for unqualified names, nothing
else.


If I understand well, then the same rules are applied for qualified or not qualified identifiers (when there is a conflict), and the variables have low priority.

The db2 has the possibility to compile objects, and it can block the usage variables created after compilation - (if I understand well the described behaviour).

Regards

Pavel

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Wed, May 22, 2024 at 08:44:28PM +0200, Pavel Stehule wrote:
> st 22. 5. 2024 v 19:25 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
>
> > Peter Eisentraut <peter@eisentraut.org> writes:
> > > On 18.05.24 13:29, Alvaro Herrera wrote:
> > >> I want to note that when we discussed this patch series at the dev
> > >> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
> > >> schema variables at all because of the fact that creating a variable
> > >> would potentially change the meaning of queries by shadowing table
> > >> columns.  But this turns out to be incorrect: it's_variables_  that are
> > >> shadowed by table columns, not the other way around.
> >
> > > But that's still bad, because seemingly unrelated schema changes can
> > > make variables appear and disappear.  For example, if you have
> > >       SELECT a, b FROM table1
> > > and then you drop column b, maybe the above query continues to work
> > > because there is also a variable b.
> >
> > Yeah, that seems pretty dangerous.  Could we make it safe enough
> > by requiring some qualification on variable names?  That is, if
> > you mean b to be a variable, then you must write something like
> >
> >         SELECT a, pg_variables.b FROM table1
> >
> > This is still ambiguous if you use "pg_variables" as a table alias in
> > the query, but the alias would win so the query still means what it
> > meant before.  Also, table aliases (as opposed to actual table names)
> > don't change readily, so I don't think there's much risk of the query
> > suddenly meaning something different than it did yesterday.
> >
>
> With active shadowing variable warning for described example you will get a
> warning before dropping.

I assume you're talking about a warning, which one will get querying the
table with shadowed columns. If no such query has happened yet and the
column was dropped, there will be no warning.

Aside that, I'm afraid dropping a warning in log does not have
sufficient visibility to warn about the issue, since one needs to read
those logs first. I guess what folks are looking for is more constraints
out of the box, preventing any ambiguity.

> Session variables are joined with schema (in my proposal). Do anybody can
> do just
>
> CREATE SCHEMA svars; -- or what (s)he likes
> CREATE VARIABLE svars.b AS int;
>
> SELECT a, b FROM table1
>
> and if somebody can be really safe, the can write
>
> SELECT t.a, t.b FROM table1 t
>
> or
>
> SELECT t.a, svars.b FROM table1 t
>
> It can be customized in the way anybody prefers - just creating dedicated
> schemas and setting search_path. Using its own schema for session variables
> without enhancing search_path for this schema forces the necessity to set
> only qualified names for session variables.
>
> Sure the naming of schemas, aliases can be unhappy wrong, and there can be
> the problem. But this can be a problem today too.

If I understand you correctly, you're saying that there are "best
practices" how to deal with session variables to avoid any potential
issues. But I think it's more user-friendly to have something that will
not allow shooting yourself in the foot right out of the box. You're
right, similar things could probably happen with the already existing
functionality, but it doesn't give us rights to add more to it.
Especially if it's going to be about a brand-new feature.

As far as I can see now, it's a major design flaw that could keep the
patch from being accepted. Fortunately there are few good proposals how
to address this, folks are genuinely trying to help. What do you think
about trying some of them out, as an alternative approach, to compare
functionality and user experience?

In the meantime I'm afraid I have to withdraw "Ready for committer"
status, sorry. I've clearly underestimated the importance of variables
shadowing, thanks Alvaro and Peter for pointing out some dangerous
cases. I still believe though that the majority of the patch is in a
good shape and the question about variables shadowing is the only thing
that keeps it from moving forward.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

pá 24. 5. 2024 v 13:32 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Wed, May 22, 2024 at 08:44:28PM +0200, Pavel Stehule wrote:
> st 22. 5. 2024 v 19:25 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
>
> > Peter Eisentraut <peter@eisentraut.org> writes:
> > > On 18.05.24 13:29, Alvaro Herrera wrote:
> > >> I want to note that when we discussed this patch series at the dev
> > >> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
> > >> schema variables at all because of the fact that creating a variable
> > >> would potentially change the meaning of queries by shadowing table
> > >> columns.  But this turns out to be incorrect: it's_variables_  that are
> > >> shadowed by table columns, not the other way around.
> >
> > > But that's still bad, because seemingly unrelated schema changes can
> > > make variables appear and disappear.  For example, if you have
> > >       SELECT a, b FROM table1
> > > and then you drop column b, maybe the above query continues to work
> > > because there is also a variable b.
> >
> > Yeah, that seems pretty dangerous.  Could we make it safe enough
> > by requiring some qualification on variable names?  That is, if
> > you mean b to be a variable, then you must write something like
> >
> >         SELECT a, pg_variables.b FROM table1
> >
> > This is still ambiguous if you use "pg_variables" as a table alias in
> > the query, but the alias would win so the query still means what it
> > meant before.  Also, table aliases (as opposed to actual table names)
> > don't change readily, so I don't think there's much risk of the query
> > suddenly meaning something different than it did yesterday.
> >
>
> With active shadowing variable warning for described example you will get a
> warning before dropping.

I assume you're talking about a warning, which one will get querying the
table with shadowed columns. If no such query has happened yet and the
column was dropped, there will be no warning.

sure - the possible identifier collision cannot be solved in SQL perfectly. It is the same with tables.
When I add badly named column to table, I'll get an error "ambiguous  columns" just when I'll execute
query. The system catalog just cannot protect against collisions - it is true for columns, variables, tables.
Little bit protected are views, that are stored in parsed format, but any other object can be broken when
somebody choose bad names in catalog or queries. There is not any protection.
 

Aside that, I'm afraid dropping a warning in log does not have
sufficient visibility to warn about the issue, since one needs to read
those logs first. I guess what folks are looking for is more constraints
out of the box, preventing any ambiguity.

We can increase (optionality) the level of this message to error. It is not perfect, but it can work well.

I think so there is not higher risk with variables than current risk with just tables.

a) the possibility to create variables is limited by rights on schema. So nobody can create variables without necessary rights (invisibly)

b) if user has own schema with CREATE right, then it can create variables just for self, and with default setting, just visible for self, 
and just accessible for self. When other users try to use these variables, then the query fails due to missing access rights (usually).
Common user cannot to create variables in application schema and cannot to set search_path for applications.

c) the changes of schema are usually tested on some testing stages before are applied on production. So when there
will be possible collision or some other defect, probably it will be detected there. Untested changes of catalog on production is not too common
today.

d) any risk that can be related for variables, is related just to renaming column or table.

 

> Session variables are joined with schema (in my proposal). Do anybody can
> do just
>
> CREATE SCHEMA svars; -- or what (s)he likes
> CREATE VARIABLE svars.b AS int;
>
> SELECT a, b FROM table1
>
> and if somebody can be really safe, the can write
>
> SELECT t.a, t.b FROM table1 t
>
> or
>
> SELECT t.a, svars.b FROM table1 t
>
> It can be customized in the way anybody prefers - just creating dedicated
> schemas and setting search_path. Using its own schema for session variables
> without enhancing search_path for this schema forces the necessity to set
> only qualified names for session variables.
>
> Sure the naming of schemas, aliases can be unhappy wrong, and there can be
> the problem. But this can be a problem today too.

If I understand you correctly, you're saying that there are "best
practices" how to deal with session variables to avoid any potential
issues. But I think it's more user-friendly to have something that will
not allow shooting yourself in the foot right out of the box. You're
right, similar things could probably happen with the already existing
functionality, but it doesn't give us rights to add more to it.
Especially if it's going to be about a brand-new feature.

Unfortunately, there is not any possibility - just in SQL (without introduction of variables).

Example - Tom's proposal using dedicated schema

ok - I can limit the possibility to create variables just for schema "pg_var"

CREATE VARIABLE pg_var.a AS int;

but if somebody will write query like

SELECT pg_var.a FROM tab pg_var

then we are back on start.

 

As far as I can see now, it's a major design flaw that could keep the
patch from being accepted. Fortunately there are few good proposals how
to address this, folks are genuinely trying to help. What do you think
about trying some of them out, as an alternative approach, to compare
functionality and user experience?

It is a design flaw of SQL. The issue we talk about is the generic property of SQL, and then you cannot fix it.

I thought about possibility to introduce dedicated function

svalue(regvariable) returns any - with planner support

and possibility to force usage of this function. Another possibility is using some simple dedicated operator (syntax) for force using of variables
so theoretically this can looks like:

set strict_usage_of_session_variables to on;
SELECT * FROM tab WHERE a = svalue('myvar.var');
or

SELECT * FROM tab WHERE a = @ myvar.var;

This can be really safe. Personally It is not my cup of tea, but I can live it (and this mode can be default).

Theoretically we can limit usage of variables just for PL/pgSQL. It can reduce risks too, but it breaks usage variables for parametrization of DO blocks (what is my primary motivation), but it can be good enough to support migration from PL/SQL.
 

In the meantime I'm afraid I have to withdraw "Ready for committer"
status, sorry. I've clearly underestimated the importance of variables
shadowing, thanks Alvaro and Peter for pointing out some dangerous
cases. I still believe though that the majority of the patch is in a
good shape and the question about variables shadowing is the only thing
that keeps it from moving forward.

I understand.

I'll try to recapitulate my objections against proposed designs

a) using syntax like MS - DECLARE command and '@@' prefix - it is dynamic, so there is not possibility of static check. It is not joined with schema, so there are possible collisions between variables and and the end the variables are named like @@mypackage_myvar - so some custom naming convention is necessary too. There is not possibility to set access rights.

b) using variables like MySQL - first usage define it, and access by '@' prefix. It is simple, but without possibility of static check. There is not possibility to set access rights.

c) using variables with necessity to define it in FROM clause. It is safe, but it can be less readable, when you use more variables, and it is not too readable, and user friendly, because you need to write FROM. And can be messy, because you usually will use variables in queries, and it is introduce not relations into FROM clause. But I can imagine this mode as alternative syntax, but it is very unfriendly and not intuitive (I think). More probably it doesn't fast execution in simple expression execution mode.

d) my proposal - there is possibility of collisions, but consistent with naming of database objects, allows set of access rights, allows static analyze, consistent with PL/pgSQL and similar to PL/pgSQL.

There is not any other possibility. Any time this is war between be user friendly, be readable, be correctly - but there is not perfect solution, because just SQL is not perfect. Almost all mentioned objections against proposed variables are valid just for tables and columns.

Regards

Pavel
 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

 

As far as I can see now, it's a major design flaw that could keep the
patch from being accepted. Fortunately there are few good proposals how
to address this, folks are genuinely trying to help. What do you think
about trying some of them out, as an alternative approach, to compare
functionality and user experience?

It is a design flaw of SQL. The issue we talk about is the generic property of SQL, and then you cannot fix it.

I thought about possibility to introduce dedicated function

svalue(regvariable) returns any - with planner support

and possibility to force usage of this function. Another possibility is using some simple dedicated operator (syntax) for force using of variables
so theoretically this can looks like:

set strict_usage_of_session_variables to on;
SELECT * FROM tab WHERE a = svalue('myvar.var');
or

SELECT * FROM tab WHERE a = @ myvar.var;

This can be really safe. Personally It is not my cup of tea, but I can live it (and this mode can be default).

Theoretically we can limit usage of variables just for PL/pgSQL. It can reduce risks too, but it breaks usage variables for parametrization of DO blocks (what is my primary motivation), but it can be good enough to support migration from PL/SQL.

another possibility can be disable / enable usage of session variables on session level

like set enable_session_variable to on/off

so when the application doesn't use session variables, and then session variables can be disabled, but the user can enable it just for self for self session. Then the risk of unwanted usage of session variables can be zero. This is similar to discussion about login triggers. This mechanism can be used for using session variables only in PL too.


 
 

In the meantime I'm afraid I have to withdraw "Ready for committer"
status, sorry. I've clearly underestimated the importance of variables
shadowing, thanks Alvaro and Peter for pointing out some dangerous
cases. I still believe though that the majority of the patch is in a
good shape and the question about variables shadowing is the only thing
that keeps it from moving forward.

I understand.

I'll try to recapitulate my objections against proposed designs

a) using syntax like MS - DECLARE command and '@@' prefix - it is dynamic, so there is not possibility of static check. It is not joined with schema, so there are possible collisions between variables and and the end the variables are named like @@mypackage_myvar - so some custom naming convention is necessary too. There is not possibility to set access rights.

b) using variables like MySQL - first usage define it, and access by '@' prefix. It is simple, but without possibility of static check. There is not possibility to set access rights.

c) using variables with necessity to define it in FROM clause. It is safe, but it can be less readable, when you use more variables, and it is not too readable, and user friendly, because you need to write FROM. And can be messy, because you usually will use variables in queries, and it is introduce not relations into FROM clause. But I can imagine this mode as alternative syntax, but it is very unfriendly and not intuitive (I think). More probably it doesn't fast execution in simple expression execution mode.

d) my proposal - there is possibility of collisions, but consistent with naming of database objects, allows set of access rights, allows static analyze, consistent with PL/pgSQL and similar to PL/pgSQL.

There is not any other possibility. Any time this is war between be user friendly, be readable, be correctly - but there is not perfect solution, because just SQL is not perfect. Almost all mentioned objections against proposed variables are valid just for tables and columns.

Regards

Pavel
 

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

st 22. 5. 2024 v 19:25 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Peter Eisentraut <peter@eisentraut.org> writes:
> On 18.05.24 13:29, Alvaro Herrera wrote:
>> I want to note that when we discussed this patch series at the dev
>> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
>> schema variables at all because of the fact that creating a variable
>> would potentially change the meaning of queries by shadowing table
>> columns.  But this turns out to be incorrect: it's_variables_  that are
>> shadowed by table columns, not the other way around.

> But that's still bad, because seemingly unrelated schema changes can
> make variables appear and disappear.  For example, if you have
>       SELECT a, b FROM table1
> and then you drop column b, maybe the above query continues to work
> because there is also a variable b.

Yeah, that seems pretty dangerous.  Could we make it safe enough
by requiring some qualification on variable names?  That is, if
you mean b to be a variable, then you must write something like

        SELECT a, pg_variables.b FROM table1

This is still ambiguous if you use "pg_variables" as a table alias in
the query, but the alias would win so the query still means what it
meant before.  Also, table aliases (as opposed to actual table names)
don't change readily, so I don't think there's much risk of the query
suddenly meaning something different than it did yesterday.

we can introduce special safe mode started by

set enable_direct_variable_read to off;

and allowing access to variables only by usage dedicated function (supported by parser) named like variable or pg_variable

so it can looks like

select a, pg_variable(myschema.myvar) from table 

In this mode, the variables never are readable directly, so there is no risk of collision and issue mentioned by Peter. And the argument of the pg_variable pseudo function can be only variable, so risk of possible collision can be reduced too. The pseudo function pg_variable can be used in less restrictive mode too, when the user can explicitly show usage of the variable. 

Tom's proposal is already almost supported now. The user can use a dedicated schema without assigning this schema to search_path. Then a qualified name should be required.

Can this design be the correct answer for mentioned objections?

 Regards

Pavel



                        regards, tom lane

Re: Schema variables - new implementation for Postgres 15

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> we can introduce special safe mode started by
> set enable_direct_variable_read to off;
> and allowing access to variables only by usage dedicated function
> (supported by parser) named like variable or pg_variable

Didn't we learn twenty years ago that GUCs that change query
semantics are an awful idea?  Pick a single access method
for these things and stick to it.

            regards, tom lane



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 25. 5. 2024 v 3:29 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> we can introduce special safe mode started by
> set enable_direct_variable_read to off;
> and allowing access to variables only by usage dedicated function
> (supported by parser) named like variable or pg_variable

Didn't we learn twenty years ago that GUCs that change query
semantics are an awful idea?  Pick a single access method
for these things and stick to it.

I don't think the proposed GUC exactly changes query semantics - it is equivalent of plpgsql options: plpgsql_extra_xxxx or #variable_conflict. It allows us to identify broken queries. And for tools that generates queries is not problem to wrap reading variable by special pseudo function. The code where pseudo function will be used should to work with active or inactive strict mode (related to possibility to use variables).

Sure there is more possibilities, but I don't want to lost the possibility to write code like

CREATE TEMP VARIABLE _x;

LET _x = 'hello';

DO $$
BEGIN
  RAISE NOTICE '%', _x;
END;
$$;

So I am searching for a way to do it safely, but still intuitive and user friendly.

Regards

Pavel

 

                        regards, tom lane

Re: Schema variables - new implementation for Postgres 15

From
walther@technowledgy.de
Date:
Pavel Stehule:
> Sure there is more possibilities, but I don't want to lost the 
> possibility to write code like
> 
> CREATE TEMP VARIABLE _x;
> 
> LET _x = 'hello';
> 
> DO $$
> BEGIN
>    RAISE NOTICE '%', _x;
> END;
> $$;
> 
> So I am searching for a way to do it safely, but still intuitive and 
> user friendly.

Maybe a middle-way between this and Alvaro's proposal could be:

Whenever you have a FROM clause, a variable must be added to it to be 
accessible.  When you don't have a FROM clause, you can access it directly.

This would make the following work:

RAISE NOTICE '%', _x;

SELECT _x;

SELECT tbl.*, _x FROM tbl, _x;

SELECT tbl.*, (SELECT _x) FROM tbl, _x;

SELECT tbl.*, (SELECT _x FROM _x) FROM tbl;


But the following would be an error:

SELECT tbl.*, _x FROM tbl;

SELECT tbl.*, (SELECT _x) FROM tbl;


Best,

Wolfgang



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


so 25. 5. 2024 v 10:24 odesílatel <walther@technowledgy.de> napsal:
Pavel Stehule:
> Sure there is more possibilities, but I don't want to lost the
> possibility to write code like
>
> CREATE TEMP VARIABLE _x;
>
> LET _x = 'hello';
>
> DO $$
> BEGIN
>    RAISE NOTICE '%', _x;
> END;
> $$;
>
> So I am searching for a way to do it safely, but still intuitive and
> user friendly.

Maybe a middle-way between this and Alvaro's proposal could be:

Whenever you have a FROM clause, a variable must be added to it to be
accessible.  When you don't have a FROM clause, you can access it directly.

This would make the following work:

RAISE NOTICE '%', _x;

SELECT _x;

SELECT tbl.*, _x FROM tbl, _x;

SELECT tbl.*, (SELECT _x) FROM tbl, _x;

SELECT tbl.*, (SELECT _x FROM _x) FROM tbl;


But the following would be an error:

SELECT tbl.*, _x FROM tbl;

SELECT tbl.*, (SELECT _x) FROM tbl;


It looks odd - It is not intuitive, it introduces new inconsistency inside Postgres, or with solutions in other databases. No other database has a similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be confused. Users that use PL/pgSQL will be confused.

Regards

Pavel
 

Best,

Wolfgang

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi

so 25. 5. 2024 v 3:29 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> we can introduce special safe mode started by
> set enable_direct_variable_read to off;
> and allowing access to variables only by usage dedicated function
> (supported by parser) named like variable or pg_variable

Didn't we learn twenty years ago that GUCs that change query
semantics are an awful idea?  Pick a single access method
for these things and stick to it.

I propose another variants. First we can introduce pseudo function VAR( ). The argument should be session variables. The name of this function can be pgvar, globvar, ... We can talk about good name, it should not be too long, but it is not important now. The VAR() function will be pseudo function like COALESCE, so we can easily to set correct result type.

I see possible variants

1. for any read of session variable, the VAR function should be used (everywhere), the write is not problem, there is not risk of collisions. When VAR() function will be required everywhere, then the name should be shorter.

SELECT * FROM tab WHERE id = VAR(stehule.myvar);
SELECT VAR(okbob.myvar);

2. the usage of VAR() function should be required, when query has FROM clause, and then there is in risk of collisions. Without it, then the VAR() function can be optional (it is modification of Wolfgang or Alvaro proposals). I prefer this syntax before mentioning in FROM clause, just I think so it is less confusing, and FROM clause should be used for relations, and not for variables.

SELECT * FROM tab WHERE id = VAR(okbob.myvar)
SELECT okbob.myvar;

3. Outside PL the VAR() function will be required, inside PL the VAR function can be optional (and we can throw an exception) when we found collision like now

What do you think about this proposal? And if you can accept it, what version?

I think so implementation of any proposed variant should be easy. I can add extra check to plpgsql_check if the argument of VAR() function is in possible collision with other identifiers in query, but for proposed variants it is just in nice to have category

Regards

Pavel

 

                        regards, tom lane

Re: Schema variables - new implementation for Postgres 15

From
Dmitry Dolgov
Date:
> On Tue, May 28, 2024 at 05:18:02PM GMT, Pavel Stehule wrote:
>
> I propose another variants. First we can introduce pseudo function VAR( ).
> The argument should be session variables. The name of this function can be
> pgvar, globvar, ... We can talk about good name, it should not be too long,
> but it is not important now. The VAR() function will be pseudo function
> like COALESCE, so we can easily to set correct result type.

So, the purpose of the function would be only to verify that the argument is a
session variable? That seems to be a very light payload, which looks a bit
awkward.

Out of those options you propose I think the first one is the
most straightforward one, but...

> Alvaro Herrera:
> > Perhaps the solution to all this is to avoid having the variables be
> > implicitly present in the range table of all queries.  Instead, if you
> > need a variable's value, then you need to add the variable to the FROM
> > clause;

The more I think about this, the more I like this solution. Marking
which variables are available to the query this way, and using established
patterns for resolving ambiguity actually looks intuitive to me. Now I know,
you've got strong objections:

> I don't like this. Sure, this fixes the problem with collisions, but then
> we cannot talk about variables. When some is used like a table, then it
> should be a table. I can imagine memory tables, but it is a different type
> of object. Table is relation, variable is just value. Variables should not
> have columns, so using the same patterns for tables and variables has no
> sense. Using the same catalog for variables and tables. Variables just hold
> a value, and then you can use it inside a query without necessity to write
> JOIN. Variables are not tables, and then it is not too confusing so they
> are not transactional and don't support more rows, more columns.

A FROM clause could contain a function returning a single value, nobody
finds it confusing. And at least to me it's not much different from having a
session variable as well, what do you think?

> c) using variables with necessity to define it in FROM clause. It is safe,
> but it can be less readable, when you use more variables, and it is not too
> readable, and user friendly, because you need to write FROM. And can be
> messy, because you usually will use variables in queries, and it is
> introduce not relations into FROM clause. But I can imagine this mode as
> alternative syntax, but it is very unfriendly and not intuitive (I think).

The proposal from Wolfgang to have a short-cut and not add FROM in case there
is no danger of ambiguity seems to resolve that.

> More probably it doesn't fast execution in simple expression execution mode.

Could you elaborate more, what do you mean by that? If the performance
overhead is not prohibitive (which I would expect is the case), having better
UX for a new feature usually beats having better performance.

> It looks odd - It is not intuitive, it introduces new inconsistency inside
> Postgres, or with solutions in other databases. No other database has a
> similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be
> confused. Users that use PL/pgSQL will be confused.

Session variables are not part of the SQL standard, and maintaining
consistency with other databases is a questionable goal. Since it's a new
feature, I'm not sure what you mean by inconsistency inside Postgres itself.

I see that the main driving case behind this patch is to help with
migrating from other databases that do have session variables. Going with
variables in FROM clause, will not make a migration much harder -- some of the
queries would have to modify the FROM part, and that's it, right? I could
imagine it would be even easier than adding VAR() everywhere.



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 31. 5. 2024 v 11:46 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Tue, May 28, 2024 at 05:18:02PM GMT, Pavel Stehule wrote:
>
> I propose another variants. First we can introduce pseudo function VAR( ).
> The argument should be session variables. The name of this function can be
> pgvar, globvar, ... We can talk about good name, it should not be too long,
> but it is not important now. The VAR() function will be pseudo function
> like COALESCE, so we can easily to set correct result type.

So, the purpose of the function would be only to verify that the argument is a
session variable? That seems to be a very light payload, which looks a bit
awkward.

no, it just reduces catalog searching to variables. So with using this function, then there is no possibility of collision between variables and other objects. The argument can be only variable and nothing else. So then the conflict is not possible. When somebody tries to specify a table or column, then it fails, because this object will not be detected. So inside this function, the tables and columns cannot to shading variables, and variables cannot be replaced by columns.

So the proposed function is not just assert, it is designed like a catalog filter.


Out of those options you propose I think the first one is the
most straightforward one, but...

> Alvaro Herrera:
> > Perhaps the solution to all this is to avoid having the variables be
> > implicitly present in the range table of all queries.  Instead, if you
> > need a variable's value, then you need to add the variable to the FROM
> > clause;

The more I think about this, the more I like this solution. Marking
which variables are available to the query this way, and using established
patterns for resolving ambiguity actually looks intuitive to me. Now I know,
you've got strong objections:

I still don't like this - mainly from two reasons

1. it doesn't look user friendly - you need to maintain two different places in one query for one object.  I can imagine usage there in the case of composite variables with unpacking (and then it can be consistent with others). I can imagine to use optional usage of variables there for the possibility of realiasing - like functions - and if we should support it, then with unpacking of composite values.

(2024-05-31 12:33:57) postgres=# create type t as (a int, b int);
CREATE TYPE
(2024-05-31 12:35:26) postgres=# create function fx() returns t as $$ select 1, 2 $$ language sql;
CREATE FUNCTION
(2024-05-31 12:35:44) postgres=# select fx();
┌───────┐
│  fx   │
╞═══════╡
│ (1,2) │
└───────┘
(1 row)

(2024-05-31 12:35:47) postgres=# select * from fx();
┌───┬───┐
│ a │ b │
╞═══╪═══╡
│ 1 │ 2 │
└───┴───┘
(1 row)

2. But my main argument is, it is not really safe - it solves Peter's use case, but if I use a reverse example of Peter's case, I still have a problem.

I can have a variable x, and then I can write query like `SELECT x FROM x`;

but if somebody creates table x(x int), then the query `SELECT x FROM x` will be correct, but it is surely something else. So the requirement of the usage variable inside FROM clause doesn't help. It doesn't work.







> I don't like this. Sure, this fixes the problem with collisions, but then
> we cannot talk about variables. When some is used like a table, then it
> should be a table. I can imagine memory tables, but it is a different type
> of object. Table is relation, variable is just value. Variables should not
> have columns, so using the same patterns for tables and variables has no
> sense. Using the same catalog for variables and tables. Variables just hold
> a value, and then you can use it inside a query without necessity to write
> JOIN. Variables are not tables, and then it is not too confusing so they
> are not transactional and don't support more rows, more columns.

A FROM clause could contain a function returning a single value, nobody
finds it confusing. And at least to me it's not much different from having a
session variable as well, what do you think?

but there is a difference when function returns composite, and when not - if I use function in FROM clause, I'll get unpacked columns, when I use function in columns, then I get composite.

The usage variable in FROM clause can have sense in similar princip like functions - for possibility to use alias in same level of query and possibility to use one common syntax for composite unpacking. But it doesn't help with safety against collisions.
 

> c) using variables with necessity to define it in FROM clause. It is safe,
> but it can be less readable, when you use more variables, and it is not too
> readable, and user friendly, because you need to write FROM. And can be
> messy, because you usually will use variables in queries, and it is
> introduce not relations into FROM clause. But I can imagine this mode as
> alternative syntax, but it is very unfriendly and not intuitive (I think).

The proposal from Wolfgang to have a short-cut and not add FROM in case there
is no danger of ambiguity seems to resolve that.

> More probably it doesn't fast execution in simple expression execution mode.

Could you elaborate more, what do you mean by that? If the performance
overhead is not prohibitive (which I would expect is the case), having better
UX for a new feature usually beats having better performance.

PLpgSQL has a special mode for faster expression execution. One prerequisite is not using FROM clause.


> It looks odd - It is not intuitive, it introduces new inconsistency inside
> Postgres, or with solutions in other databases. No other database has a
> similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be
> confused. Users that use PL/pgSQL will be confused.

Session variables are not part of the SQL standard, and maintaining
consistency with other databases is a questionable goal. Since it's a new
feature, I'm not sure what you mean by inconsistency inside Postgres itself.

I see that the main driving case behind this patch is to help with
migrating from other databases that do have session variables. Going with
variables in FROM clause, will not make a migration much harder -- some of the
queries would have to modify the FROM part, and that's it, right? I could
imagine it would be even easier than adding VAR() everywhere.

I don't think - VAR(x) instead x is just a simple replacement - searching related FROM clauses is much more complex work.

and if we talk about safety against collisions,  then FROM clause doesn't help. Moreover, this safety is not guaranteed today because we have a search patch and we support unqualified identifiers.

Regards

Pavel
 

Re: Schema variables - new implementation for Postgres 15

From
Wolfgang Walther
Date:
Pavel Stehule:
> 2. But my main argument is, it is not really safe - it solves Peter's 
> use case, but if I use a reverse example of Peter's case, I still have a 
> problem.
> 
> I can have a variable x, and then I can write query like `SELECT x FROM x`;
> 
> but if somebody creates table x(x int), then the query `SELECT x FROM x` 
> will be correct, but it is surely something else. So the requirement of 
> the usage variable inside FROM clause doesn't help. It doesn't work.

But in this case you could make variables and tables share the same 
namespace, i.e. forbid creating a variable with the same name as an 
already existing table.

Best,

Wolfgang



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 31. 5. 2024 v 13:10 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:
Pavel Stehule:
> 2. But my main argument is, it is not really safe - it solves Peter's
> use case, but if I use a reverse example of Peter's case, I still have a
> problem.
>
> I can have a variable x, and then I can write query like `SELECT x FROM x`;
>
> but if somebody creates table x(x int), then the query `SELECT x FROM x`
> will be correct, but it is surely something else. So the requirement of
> the usage variable inside FROM clause doesn't help. It doesn't work.

But in this case you could make variables and tables share the same
namespace, i.e. forbid creating a variable with the same name as an
already existing table.

It helps, but not on 100% - there is a search path

 

Best,

Wolfgang

Re: Schema variables - new implementation for Postgres 15

From
Wolfgang Walther
Date:
Pavel Stehule:
>     But in this case you could make variables and tables share the same
>     namespace, i.e. forbid creating a variable with the same name as an
>     already existing table.
> 
> 
> It helps, but not on 100% - there is a search path

I think we can ignore the search_path for this discussion. That's not a 
problem of variables vs tables, but just a search path related problem. 
It is exactly the same thing right now, when you create a new table x(x) 
in a schema which happens to be earlier in your search path.

The objection to the proposed approach for variables was that it would 
introduce *new* ambiguities, which Alvaro's suggestion avoids.

Best,

Wolfgang



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 31. 5. 2024 v 13:37 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:
Pavel Stehule:
>     But in this case you could make variables and tables share the same
>     namespace, i.e. forbid creating a variable with the same name as an
>     already existing table.
>
>
> It helps, but not on 100% - there is a search path

I think we can ignore the search_path for this discussion. That's not a
problem of variables vs tables, but just a search path related problem.
It is exactly the same thing right now, when you create a new table x(x)
in a schema which happens to be earlier in your search path.
 
I don't think it is a valid argument - search_path is there, and we cannot ignore it, because it allows just one case.

And the need to use a variable in FROM clause introduces implicit unpacking or inconsistency with current work with composite's types, so I am more sure this way is not good.


 

The objection to the proposed approach for variables was that it would
introduce *new* ambiguities, which Alvaro's suggestion avoids.

Best,

Wolfgang

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 31. 5. 2024 v 15:02 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


pá 31. 5. 2024 v 13:37 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:
Pavel Stehule:
>     But in this case you could make variables and tables share the same
>     namespace, i.e. forbid creating a variable with the same name as an
>     already existing table.
>
>
> It helps, but not on 100% - there is a search path

I think we can ignore the search_path for this discussion. That's not a
problem of variables vs tables, but just a search path related problem.
It is exactly the same thing right now, when you create a new table x(x)
in a schema which happens to be earlier in your search path.
 
I don't think it is a valid argument - search_path is there, and we cannot ignore it, because it allows just one case.

And the need to use a variable in FROM clause introduces implicit unpacking or inconsistency with current work with composite's types, so I am more sure this way is not good.

The session variables can be used in queries, but should be used in PL/pgSQL expressions, and then the mandatory usage in FROM clause will do lot of problems and unreadable code like

DO $$
BEGIN
  RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);

END
$$

This requirement does variables unusable in PL




 

The objection to the proposed approach for variables was that it would
introduce *new* ambiguities, which Alvaro's suggestion avoids.

Best,

Wolfgang

Re: Schema variables - new implementation for Postgres 15

From
Wolfgang Walther
Date:
Pavel Stehule:
> The session variables can be used in queries, but should be used in 
> PL/pgSQL expressions, and then the mandatory usage in FROM clause will 
> do lot of problems and unreadable code like
> 
> DO $$
> BEGIN
>    RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);
> 
> END
> $$
> 
> This requirement does variables unusable in PL

I already proposed earlier to only require listing them in FROM when 
there is actually a related FROM.

In this case you could still write:

RAISE NOTICE '% %', x, (SELECT a,b FROM y);

(assuming only x is a variable here)

Best,

Wolfgang



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 31. 5. 2024 v 15:29 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:
Pavel Stehule:
> The session variables can be used in queries, but should be used in
> PL/pgSQL expressions, and then the mandatory usage in FROM clause will
> do lot of problems and unreadable code like
>
> DO $$
> BEGIN
>    RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);
>
> END
> $$
>
> This requirement does variables unusable in PL

I already proposed earlier to only require listing them in FROM when
there is actually a related FROM.

but there is technical problem - plpgsql expression are internally SQL queries. Isn't possible to cleanly to parse queries and expressions differently.

 

In this case you could still write:

RAISE NOTICE '% %', x, (SELECT a,b FROM y);

(assuming only x is a variable here)

Best,

Wolfgang

Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:



In this case you could still write:

RAISE NOTICE '% %', x, (SELECT a,b FROM y);

(assuming only x is a variable here)

no - y was a composite variable. 

When you write RAISE NOTICE '%', x, then PLpgSQL parser rewrite it to RAISE NOTICE '%', SELECT $1 

There is no parser just for expressions.

 

Best,

Wolfgang

Re: Schema variables - new implementation for Postgres 15

From
Wolfgang Walther
Date:
Pavel Stehule:
> When you write RAISE NOTICE '%', x, then PLpgSQL parser rewrite it to 
> RAISE NOTICE '%', SELECT $1
> 
> There is no parser just for expressions.

That's why my suggestion in [1] already made a difference between:

SELECT var;

and

SELECT col, var FROM table, var;

So the "only require variable-in-FROM if FROM is used" should extend to 
the SQL level.

That should be possible, right?

Best,

Wolfgang

[1]: 
https://www.postgresql.org/message-id/e7faf42f-62b8-47f4-af5c-cb8efa3e0e20%40technowledgy.de



Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


pá 31. 5. 2024 v 15:49 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:
Pavel Stehule:
> When you write RAISE NOTICE '%', x, then PLpgSQL parser rewrite it to
> RAISE NOTICE '%', SELECT $1
>
> There is no parser just for expressions.

That's why my suggestion in [1] already made a difference between:

SELECT var;

and

SELECT col, var FROM table, var;

So the "only require variable-in-FROM if FROM is used" should extend to
the SQL level.

That should be possible, right?

1. you need to implement extra path - the data from FROM clause are processed differently than params  - it is much more code (and current code should to stay if you want to support it)

2. current default behave is implicit unpacking of composites when are used in FROM clause. So it is problem when you want to use composite in query without unpacking

3. when I'll support SELECT var and SELECT var FROM var together, then it will raise a collision with self, that should be solved

4. there is not any benefit if variables and tables doen't share catalog, but session variables requires lsn number, and it can be problem to use it is table catalog

5. identification when the variable needs or doesn't need FROM clause isn't easy

there can be lot of combinations like SELECT (SELECT var), c FROM tab  or SELECT var, (SELECT c) FROM c and if c is variable, then FROM is not necessary.

If somebody will write SELECT (SELECT var OFFSET 0) FROM ... then subselect can know nothing about outer query - so it means minimally one check over all nodes

It is possible / but it is multiple more complex than current code (and I am not sure if store lns in pg_class is possible ever)

6. I think so plpgsql case statement use multicolumn expression, so you can write

CASE WHEN x = 1, (SELECT count(*) FROM tab) THEN ...

It is synthetic, but we are talking about what is possible.

and although it looks correctly, and will work if x will be plpgsql variable, then it will not work if x will be session variable

and then you need to fix it like

CASE WHEN (SELECT x=1 FROM x), (SELECT count(*) FROM tab) THEN

so it is possible, but it is clean only in trivial cases, and can be pretty messy

Personally, I cannot to imagine to explain to any user so following (proposed by you) behaviour is intuitive and friendly

CREATE VARIABLE a as int;
CREATE TABLE test(id int);

SELECT a; --> ok
SELECT * FROM test WHERE id = a; -- error message "the column "a" doesn't exists"


Re: Schema variables - new implementation for Postgres 15

From
Peter Eisentraut
Date:
On 25.05.24 12:50, Pavel Stehule wrote:
> It looks odd - It is not intuitive, it introduces new inconsistency 
> inside Postgres, or with solutions in other databases. No other database 
> has a similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird 
> will be confused. Users that use PL/pgSQL will be confused.

Do you have a description of what those other systems do?  Maybe you 
posted it already earlier?




Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:


ne 2. 6. 2024 v 23:31 odesílatel Peter Eisentraut <peter@eisentraut.org> napsal:
>
> On 25.05.24 12:50, Pavel Stehule wrote:
> > It looks odd - It is not intuitive, it introduces new inconsistency
> > inside Postgres, or with solutions in other databases. No other database
> > has a similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird
> > will be confused. Users that use PL/pgSQL will be confused.
>
> Do you have a description of what those other systems do?  Maybe you
> posted it already earlier?
>

I checked today

1. MySQL

MySQL knows 3 types of variables

global - the access syntax is @@varname - they are used like our GUC and only buildin system variables are supported

SET @@autocommit = off;
SELECT @@autocommit;

user defined variables - the access syntax is @varname - the behaviour is similar to psql variables, but they are server side

SET @x = 100;
SELECT @x;

local variables - only inside PL

CREATE PROCEDURE p1()
DECLARE x int;
BEGIN
  SET x = 100;
  SELECT x;
END;

variables has higher priority than column (like old plpgsql)

2. MSSQL

global variables - the access syntax is @@varname, they are used like GUC and little bit more - some state informations are there like @@ERROR, @@ROWCOUNT or @@IDENTITY

local variables - the access syntax is @varname, and should be declared before usage by DECLARE command. The scope is limited to batch or procedure or function, where DECLARE command was executed.

DECLARE @TestVariable AS VARCHAR(100)
SET @TestVariable = 'Think Green'
GO
PRINT @TestVariable

This script fails, because PRINT is executed in another batch. So I think so MSSQL doesn't support session variables

There are similar mechanisms like our custom GUC and usage current_setting and set_config functions. Generally, in this area is MSSQL very primitive

EXEC sp_set_session_context 'user_id', 4;  
SELECT SESSION_CONTEXT(N'user_id');

3. DB2

The "user defined global variables" are similar to my proposal. The differences are different access rights "READ, WRITE" x "SELECT, UPDATE". Because PostgreSQL has SET command for GUC, I introduced LET command (DB2 uses SET)

Variables are visible in all sessions, but value is private per session. Variables are not transactional. The usage is wider than my proposal. Then can be changed by commands SET, SELECT INTO or they can be used like OUT parameters of procedures. The search path (or some like that) is used for variables too, but the variables has less priority than tables/columns.

CREATE VARIABLE myCounter INT DEFAULT 01;
SELECT EMPNO, LASTNAME, CASE WHEN myCounter = 1 THEN SALARY ELSE NULL END FROM EMPLOYEE WHERE WORKDEPT = ’A00’;
SET myCounter = 29;

There are (I think) different kinds of variables - accessed by the function GETVARIABLE('name', 'default) - it looks very similar ro our GUC and `current_setting` function. These variables can be set by connection string, are of varchar type and 10 values are allowed. Built-in session variables (configuration) can be accessed by the function GETVARIABLE too.

SQL stored procedures supports declared local variables like PL/pgSQL

4. Firebird

Firebird has something like our custom GUC. But it allow nested routines - so some functionality of session variables can be emulated with local variable and nested routines (but outer variables can be used only in Firebird 5)

The variables are accessed by syntax :varname - like psql, but if I understand to diagrams, the char ':' is optional

5. SQL/PSM

Standard introduces a concept of modules that can be joined with schemas. The variables are like PLpgSQL, but only local - the only temp tables can be defined on module levels. These tables can be accessed only from routines assigned to modules. Modules are declarative versions of our extensions (if I understand well, I didn't find any implementation). It allows you to overwrite the search patch for routines assigned in the module. Variables are not transactional, the priority - variables/columns is not specified.

6. Oracle

Oracle PL/SQL allows the use of package variables. PL/SQL is +/- ADA language - and package variables are "global" variables. They are not directly visible from SQL, but Oracle allows reduced syntax for functions without arguments, so you need to write a wrapper

CREATE OR REPLACE PACKAGE my_package
AS
    FUNCTION get_a RETURN NUMBER;
END my_package;
/

CREATE OR REPLACE PACKAGE BODY my_package
AS
    a  NUMBER(20);

    FUNCTION get_a
    RETURN NUMBER
    IS
    BEGIN
      RETURN a;
    END get_a;
END my_package;

SELECT my_package.get_a FROM DUAL;

Inside SQL the higher priority has SQL, inside non SQL commands like CALL or some PL/SQL command, the higher priority has packages.

The Oracle allows both syntax for calling function with zero arguments so

SELECT my_package.get_a FROM DUAL;

or

SELECT my_package.get_a() FROM DUAL;

Then there is less risk reduction of collision. Package variables persist in session

Another possibility is using variables in SQL*Plus (looks like our psql variables, with possibility to define type on server side)

The variable should be declared by command VARIABLE and can be accessed by syntax :varname in session before usage (maybe this step is optional)

VARIABLE bv_variable_name VARCHAR2(30)

BEGIN
  :bv_variable_name := 'Some Value';
END;

SELECT column_name
FROM   table_name
WHERE  column_name = :bv_variable_name;

This is something between MSSQL and MYSQL session variables - but internally it is binding parameters - what I know, Postgres cannot set these parameters as result of some pg operation.

SQL*Plus is strange creature


Generally, the possible collision between variables and columns are solved by

a) special syntax - using prefix like @ or :
b) dedicated functions
c) variables has lower priority than columns

You can see, the RDBMS allows different types of session variables, different implementations. Usually one system allows more implementation of session variables. There is a possibility of emulation implementation between RDBMS, but security setting is possible only in Oracle or DB2.

Regards

Pavel






Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:
Hi


You can see, the RDBMS allows different types of session variables, different implementations. Usually one system allows more implementation of session variables. There is a possibility of emulation implementation between RDBMS, but security setting is possible only in Oracle or DB2.

MySQL concept is very handy for ad hoc work, but it is too primitive for secure or safe use in stored procedures.

Oracle concept is safe, but needs packages, needs writing wrappers, needs PL/SQL.

I designed a concept that is very similar to DB2 (independently on IBM), and I think it is strong and can be well mapped to PostgreSQL (no packages, more different PL, strongly typed, ...)

I think it would be nice to support the MySQL concept as syntactic sugar for GUC. This can be easy and for some use cases really very handy (and less confusing for beginners - using set_confing and current_setting is intuitive for work (emulation) of session variables (although the MSSQL solution is less intuitive).

SET @myvar TO 10; --> SELECT set_config('session.myvar', 10)
SET @@work_mem TO '10MB'; --> SELECT set_config('work_mem', '10MB');
SELECT @myvar; --> SELECT current_setting('session.myvar');
SELECT @@work_mem; --> SELECT current_setting('work_mem');

The syntax @ and @@ is widely used, and the mapping can be simple. This my proposal is not a replacement of the proposal of "schema" session variables. It is another concept, and I think so both can live together very well, because they are used for different purposes. Oracle, DB2 supports +/- both concepts

Regards

Pavel



 

Regards

Pavel






Re: Schema variables - new implementation for Postgres 15

From
Pavel Stehule
Date:

6. Oracle

Oracle PL/SQL allows the use of package variables. PL/SQL is +/- ADA language - and package variables are "global" variables. They are not directly visible from SQL, but Oracle allows reduced syntax for functions without arguments, so you need to write a wrapper

CREATE OR REPLACE PACKAGE my_package
AS
    FUNCTION get_a RETURN NUMBER;
END my_package;
/

CREATE OR REPLACE PACKAGE BODY my_package
AS
    a  NUMBER(20);

    FUNCTION get_a
    RETURN NUMBER
    IS
    BEGIN
      RETURN a;
    END get_a;
END my_package;

SELECT my_package.get_a FROM DUAL;

Inside SQL the higher priority has SQL, inside non SQL commands like CALL or some PL/SQL command, the higher priority has packages.

The risk of collision's identifier is in some PL/SQL statements less than in Postgres, because SQL can be used only on dedicated positions (minimally in older Oracle's versions). Against other databases there is not allowed to use SQL everywhere as an expression. PL/SQL is an independent language, environment with its own expression executor (compiler). Other databases allow you to use an SQL subselect (I tested MySQL,  PL/pgSQL, and I think (if I remember docs well) it is in standard SQL/PSM (related part of ANSI/SQL)) as expression. The integration of SQL into PL/SQL is not too deep and stored procedures look more like client code executed on the server side.

Regards

Pavel