Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Schema variables - new implementation for Postgres 15
Date
Msg-id 20220126072306.xxkfqma6rwfdkxff@jrouhaud
Whole thread Raw
In response to Re: Schema variables - new implementation for Postgres 15  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Schema variables - new implementation for Postgres 15
Re: Schema variables - new implementation for Postgres 15
List pgsql-hackers
Hi,

On Tue, Jan 25, 2022 at 10:53:00PM +0100, Pavel Stehule wrote:
> 
> út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> >
> > First, I don't think that acquiring the lock in
> > get_session_variable_type_typmod_collid() and
> > prepare_variable_for_reading() is
> > the correct approach.  In transformColumnRef() and transformLetStmt() you
> > first
> > call IdentifyVariable() to check if the given name is a variable without
> > locking it and later try to lock the variable if you get a valid Oid.
> > This is
> > bug prone as any other backend could drop the variable between the two
> > calls
> > and you would end up with a cache lookup failure.  I think the lock should
> > be
> > acquired during IdentifyVariable.  It should probably be optional as one
> > codepath only needs the information to raise a warning when a variable is
> > shadowed, so a concurrent drop isn't a problem there.
> >
> 
> I moved lock to IdentifyVariable routine

+IdentifyVariable(List *names, char **attrname, bool lockit, bool *not_unique)
+{
[...]
+               return varoid_without_attr;
+           }
+           else
+           {
+               *attrname = c;
+               return varoid_with_attr;
[...]
+
+   if (OidIsValid(varid) && lockit)
+       LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
+
+   return varid;

There are still some code paths that may not lock the target variable when
required.

Also, the function comment doesn't say much about attrname handling, it should
be clarifed.  I think it should initially be set to NULL, to make sure that
it's always a valid pointer after the function returns.


> attached updated patch

Various comments on the patch:

No test for GRANT/REVOKE ... ALL VARIABLES IN SCHEMA, maybe it would be good to
have one?

Documentation:

catalogs.sgml:

You're still using the old-style 4 columns table, it should be a single column
like the rest of the file.

+  <para>
+   The <command>CREATE VARIABLE</command> command creates a session variable.
+   Session variables, like relations, exist within a schema and their access is
+   controlled via <command>GRANT</command> and <command>REVOKE</command>
+   commands.  Changing a session variable is non-transactional.
+  </para>

The "changing a session variable is non-transactional" is ambiguous.  I think
that only the value part isn't transactional, the variable metadata themselves
(ALTER VARIABLE and other DDL) are transactional right?  This should be
explicitly described here (although it's made less ambiguous in the next
paragraph).

+  <para>
+   Session variables are retrieved by the <command>SELECT</command> SQL
+   command.  Their value is set with the <command>LET</command> SQL command.
+   While session variables share properties with tables, their value cannot be
+   updated with an <command>UPDATE</command> command.
+  </para>

should this part mention that session variables can be shadowed?  For now the
only mention to that is in advanced.sgml.

+      The <literal>DEFAULT</literal> clause can be used to assign a default
+      value to a session variable.

The expression is lazily evaluated during the session first use of the
variable.  This should be documented as any usage of volatile expression will
be impacted.

+      The <literal>ON TRANSACTION END RESET</literal>
+      clause causes the session variable to be reset to its default value when
+      the transaction is committed or rolled back.

As far as I can see this clauses doesn't play well with IMMUTABLE VARIABLE, as
you can reassign a value once the transaction ends.  Same for DISCARD [ ALL |
VARIABLES ], or LET var = NULL (or DEFAULT if no default value).  Is that
intended?

+   <literal>LET</literal> extends the syntax defined in the SQL
+   standard. The <literal>SET</literal> command from the SQL standard
+   is used for different purposes in <productname>PostgreSQL</productname>.

I don't fully understand that.  Are (session) variables defined in the SQL
standard?  If yes, all the other documentation pages should clarify that as
they currently say that this is a postgres extension.  If not, this part should
made it clear what is defined in the standard.

In revoke.sgml:
+ REVOKE [ GRANT OPTION FOR ]
+     { { READ | WRITE } [, ...] | ALL [ PRIVILEGES ] }
+     ON VARIABLE <replaceable>variable_name</replaceable> [, ...]
+     FROM { [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC } [, ...]
+     [ CASCADE | RESTRICT ]

there's no extra documentation for that, and therefore no clarification on
variable_name.

VariableIsVisible():
+         * If it is in the path, it might still not be visible; it could be
+         * hidden by another relation of the same name earlier in the path. So
+         * we must do a slow check for conflicting relations.

should it be "another variable of the same name"?


Tab completion: CREATE IMMUTABLE VARIABLE is not handled


pg_variable.c:
Do we really need both session_variable_get_name() and
get_session_variable_name()?

+/*
+ * Fetch all fields of session variable from the syscache.
+ */
+void
+initVariable(Variable *var, Oid varid, bool missing_ok, bool fast_only)

As least fast_only should be documented in the function comment, especially
regarding var->varname, since:

+   var->oid = varid;
+   var->name = pstrdup(NameStr(varform->varname));
[...]
+   if (!fast_only)
+   {
+       Datum       aclDatum;
+       bool        isnull;
+
+       /* name */
+       var->name = pstrdup(NameStr(varform->varname));A
[...]
+   else
+   {
+       var->name = NULL;

is the output value guaranteed or not?  In any case it shouldn't be set twice.

Also, I don't see any caller for missing_ok == true, should we remove it?

+/*
+ * Create entry in pg_variable table
+ */
+ObjectAddress
+VariableCreate(const char *varName,
[...]
+   /* dependency on any roles mentioned in ACL */
+   if (varacl != NULL)
+   {
+       int         nnewmembers;
+       Oid        *newmembers;
+
+       nnewmembers = aclmembers(varacl, &newmembers);
+       updateAclDependencies(VariableRelationId, varid, 0,
+                             varOwner,
+                             0, NULL,
+                             nnewmembers, newmembers);

Shouldn't you use recordDependencyOnNewAcl() instead?  Also, sn't it missing a
recordDependencyOnOwner()?

sessionvariable.c:

+ * Although session variables are not transactional, we don't
+ * want (and we cannot) to run cleaning immediately (when we
+ * got sinval message). The value of session variables can
+ * be still used or the operation that emits cleaning can be
+ * reverted. Unfortunatelly, this check can be done only in
+ * when transaction is committed (the check against system
+ * catalog requires transaction state).

This was the original idea, but since there's now locking to make all DDL safe,
the metadata should be considered fully transactional and no session should
still be able to use a concurrently dropped variable.  Also, the invalidation
messages are not sent until the transaction is committed.  So is that approach
still needed (at least for things outside ON COMMIT DROP / ON TRANSACTION END
RESET)?

I'm also attaching a 3rd patch with some proposition for documentation
rewording (including consistent use of *session* variable), a few comments
rewording, copyright year bump and minor things like that.

Note that I still didn't really review pg_variable.c or sessionvariable.c since
there might be significant changes there for either the sinval / immutable part
I mentioned.

Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Skipping logical replication transactions on subscriber side
Next
From: Michail Nikolaev
Date:
Subject: Re: [PATCH] Full support for index LP_DEAD hint bits on standby