Re: proposal: schema variables - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: proposal: schema variables
Date
Msg-id CAFj8pRDXo-RRcy2VFDm_vzv3Eaaz6Ex=X19up=x8W4COyBNmaQ@mail.gmail.com
Whole thread Raw
In response to Re: proposal: schema variables  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: proposal: schema variables
List pgsql-hackers
Hi

út 23. 7. 2024 v 16:34 odesílatel Laurenz Albe <laurenz.albe@cybertec.at> napsal:
Thanks for the fixes and the new patch set!
I think that this would be a very valuable feature!


Thank you :-)
 
This is a very incomplete review after playing with the patch set for a while.

Some bugs and oddities I have found:

"psql" support:

  \? gives

    \dV     [PATTERN]      list variables

  I think that should say "schema variables" to distinguish them from
  psql variables.

enhanced to "list session variables"
 

  Running \dV when connected to an older server gives

    ERROR:  relation "pg_catalog.pg_variable" does not exist
    LINE 16: FROM pg_catalog.pg_variable v
                  ^

  I think it would be better not to run the query and show a response like

    session variables don't exist in server version 16

there is standardized error message - and I used now

<-->if (pset.sversion < 180000)
<-->{
<--><-->char<--><-->sverbuf[32];

<--><-->pg_log_error("The server (version %s) does not support session variables.",
<--><--><--><--><--> formatPGVersionNumber(pset.sversion, false,
<--><--><--><--><--><--><--><--><--><-->   sverbuf, sizeof(sverbuf)));
<--><-->return true;
<-->}
 

The LET statement:

    CREATE VARIABLE testvar AS int4multirange[];
    LET testvar = '{\{[2\,7]\,[11\,13]\}}';
    ERROR:  variable "laurenz.testvar" is of type int4multirange[], but expression is of type text
    LINE 1: LET testvar = '{\{[2\,7]\,[11\,13]\}}';
                          ^
    HINT:  You will need to rewrite or cast the expression.

  Sure, I can add an explicit type cast, but I think that the type should
  be determined by the type of the variable.  The right-hand side should be
  treated as "unknown", and the type input function should be used.

fixed - it needed set pstate->p_resolve_unknowns = false;

I used your example in regress test


Parameter session_variables_ambiguity_warning:

    --- a/src/backend/utils/misc/guc_tables.c
    +++ b/src/backend/utils/misc/guc_tables.c
    @@ -1544,6 +1544,16 @@ struct config_bool ConfigureNamesBool[] =
            false,
            NULL, NULL, NULL
        },
    +   {
    +       {"session_variables_ambiguity_warning", PGC_USERSET, DEVELOPER_OPTIONS,
    +           gettext_noop("Raise warning when reference to a session variable is ambiguous."),
    +           NULL,
    +           GUC_NOT_IN_SAMPLE
    +       },
    +       &session_variables_ambiguity_warning,
    +       false,
    +       NULL, NULL, NULL
    +   },

  I think the short_desc should be "Raise a warning" (with the indefinite article).

  DEVELOPER_OPTIONS is the wrong category.  We normally use that for parameters
  that are only relevant for PostgreSQL hackers.  I think it should be
  CLIENT_CONN_OTHER.

 changed


CREATE VARIABLE command:

    CREATE VARIABLE str AS text NOT NULL;
    ERROR:  session variable must have a default value, since it's declared NOT NULL

  Perhaps this error message would be better:

    session variables declared as NOT NULL must have a default value

changed
 

  This is buggy:

    CREATE VARIABLE str AS text NOT NULL DEFAULT NULL;

  Ugh.

    SELECT str;
    ERROR:  null value is not allowed for NOT NULL session variable "laurenz.str"
    DETAIL:  The result of DEFAULT expression is NULL.

  Perhaps that is a leftover from the previous coding, but I think there need be
  no check upon SELECT.  It should be enough to check during CREATE VARIABLE and
  LET.

I think it is correct. When you use SELECT str, then DEFAULT expression is executed, and then the result is assigned to a variable, and there is NOT NULL guard, which raises an exception. The variable is not initialized when you run DDL, but it is initialized when you first read or write from/to the variable. The DEFAULT expression is not evaluated in DDL time. In this case, I can detect the problem in DDL time because the result is transformed to NULL node, but generally there can be SQL non immutable function, and then I need to wait until the DEFAULT expression will be evaluated - and it is time of first reading. Unfortunately, there is not an available check if some expression is NULL, that I can use in DDL time, but I have it in plpgsql_check.

You can have a function

CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
BEGIN
  RETURN NULL;
END:
$$ LANGUAGE plpgsql;

the function is not marked as IMMUTABLE

I can

CREATE VARIABLE v AS int NOT NULL DEFAULT foo();

In DDL time I know nothing about result of foo()

When I run SELECT v; then foo is executed, and because it is in contradiction with the NOT NULL clause, it fails. And I think it is correct.

It is consistent with PL/pgSQL. I can write

(2024-07-24 11:57:33) postgres=# CREATE OR REPLACE FUNCTION fx()
postgres-# RETURNS void AS $$
postgres$# DECLARE v int NOT NULL DEFAULT NULL;
postgres$# BEGIN
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;
CREATE FUNCTION
(2024-07-24 12:06:54) postgres=# SELECT fx();
ERROR:  null value cannot be assigned to variable "v" declared NOT NULL
CONTEXT:  PL/pgSQL function fx() line 2 during statement block local variable initialization

DDL is ok, and SELECT fails.

I think so DDL should not to evaluate DEFAULT expressions - the result from this are issues described in discussion about usage 'now'::timestamp
 

pg_dump support:

  The attempt to dump a database with an older version leads to

    pg_dump: error: query failed: ERROR:  relation "pg_catalog.pg_variable" does not exist
    LINE 14: FROM pg_catalog.pg_variable v
                  ^
  Dumping variables must be conditional on the server version.



fixed, there was guard but for pg 17

 

IMMUTABLE variables:

    +   <varlistentry id="sql-createvariable-immutable">
    +    <term><literal>IMMUTABLE</literal></term>
    +    <listitem>
    +     <para>
    +      The assigned value of the session variable can not be changed.
    +      Only if the session variable doesn't have a default value, a single
    +      initialization is allowed using the <command>LET</command> command. Once
    +      done, no further change is allowed until end of transaction
    +      if the session variable was created with clause <literal>ON TRANSACTION
    +      END RESET</literal>, or until reset of all session variables by
    +      <command>DISCARD VARIABLES</command>, or until reset of all session
    +      objects by command <command>DISCARD ALL</command>.
    +     </para>
    +    </listitem>
    +   </varlistentry>

  I can see the usefulness of IMMUTABLE variables, but I am surprised that
  they are reset by DISCARD.  What is the use case you have in mind?
  The use case I can envision is an application that sets a value right after
  authentication, for use with row-level security.  But then it would be harmful
  if the user could reset the variable with DISCARD.

Primary I think about IMMUTABLE variables like about some form of cache. This cache is protected against unwanted repeated write - and can help with detection of this situation.
It is not a security tool primarily - there are access rights for this purpose. The scope of this cache can be different - sometimes session, sometimes transaction. When you use a pooler like pgbouncer, then the lifetime is ended by command DISCARD ALL. Moreover, after DISCARD ALL, the session should be in default state, so then any session variable should be not initialized. And there is an implementation reason too. When I handle command DISCARD VARIABLES, I just reset related memory contexts. It is quick and safe.

I can imagine more strict behaviour - like impossibility to discard context or necessity to have default immutable expressions. But then this clause is not useful for connection pooling with transaction mode :-/. So current design is a compromise - if somebody wants strict behaviour, then can set default immutable expressions (by self). And then there will not be a problem with DISCARD VARIABLES.

But because it supports a possibility ON TRANSACTION END RESET, then theoretically it can be used in transaction mode too without possibility to be reset by DISCARD ALL.

At the end there is just one question - the DISCARD VARIABLES should support some exceptions, or the immutability of variables should have some exceptions? I can imagine the surprise so after DISCARD ALL, the immutable variable lost a value, but the same surprise can be in the moment after DISCARD ALL when the LET immutablevar statement fails. Then we have to choose a priority - DISCARD x IMMUTABLE.

I prefer to have a stronger DISCARD VARIABLES command without exceptions (for semantic, for simple implementation), but I am open to discussion. If we want to support the IMMUTABLE clause, then we have to reply to the mentioned question. What do you think about it?

 

Documentation:
RestoreArchive
    --- a/doc/src/sgml/ddl.sgml
    +++ b/doc/src/sgml/ddl.sgml

    +   <para>
    +    The session variables can be shadowed by column references in a query. When
    +    a query contains identifiers or qualified identifiers that could be used as
    +    both a session variable identifiers and as column identifier, then the
    +    column identifier is preferred every time. Warnings can be emitted when
    +    this situation happens by enabling configuration parameter <xref
    +    linkend="guc-session-variables-ambiguity-warning"/>. User can explicitly
    +    qualify the source object by syntax <literal>table.column</literal> or
    +    <literal>variable.column</literal>.
    +   </para>

  I think you mean <literal>schema.variable</literal>, not <literal>variable.column</literal>.

fixed
 



Yours,
Laurenz Albe
Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Built-in CTYPE provider
Next
From: Peter Eisentraut
Date:
Subject: Re: Built-in CTYPE provider