Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Schema variables - new implementation for Postgres 15
Date
Msg-id CAFj8pRAt=7q5QjaoH52tcV2z6061ZHnUBJukhy+2aWtrFXySUQ@mail.gmail.com
Whole thread Raw
In response to Re: Schema variables - new implementation for Postgres 15  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Schema variables - new implementation for Postgres 15
List pgsql-hackers
Hi


po 22. 8. 2022 v 9:33 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi Pavel,

On Sun, Aug 21, 2022 at 09:54:03AM +0200, Pavel Stehule wrote:
>
> should be fixed now

I started reviewing the patchset, beginning with 0001 (at least the parts that
don't substantially change later) and have a few comments.

- you define new AclMode READ and WRITE.  Those bits are precious and I don't
  think it's ok to consume 2 bits for session variables, especially since those
  are the last two bits available since the recent GUC access control patch
  (ACL_SET and ACL_ALTER_SYSTEM).  Maybe we could existing INSERT and UPDATE
  privileges instead, like it's done for sequences?

changed - now ACL_SELECT and ACL_UPDATE are used
 

- make check and make-check-world don't pass with this test only.  Given that
  the split is mostly done to ease review and probably not intended to be
  committed this way, we probably shouldn't spend efforts to clean up the split
  apart from making sure that each patch compiles cleanly on its own.  But in
  this case it brought my attention to misc_sanity.sql test.  Looking at patch
  0010, I see:

diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index a57fd142a9..ce9bad7211 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -60,7 +60,9 @@ ORDER BY 1, 2;
  pg_index                | indpred       | pg_node_tree
  pg_largeobject          | data          | bytea
  pg_largeobject_metadata | lomacl        | aclitem[]
-(11 rows)
+ pg_variable             | varacl        | aclitem[]
+ pg_variable             | vardefexpr    | pg_node_tree
+(13 rows)

This is the test for relations with varlena columns without TOAST table.  I
don't think that's correct to add those exceptions, and there should be a TOAST
table declared for pg_variable too, as noted in the comment above that query.

- nitpicking: s/catalogue/catalog/

Some other comments on other patches while testing things around:

fixed
 

- For sessionvariable.c (in 0002), I see that there are still all the comments
  and code about checking type validity based on a generation number and other
  heuristics.  I still fail to understand why this is needed at all as the
  stored datum should remain compatible as long as we prevent the few
  incompatible DDL that are also prevented when there's a relation dependency.
  As an example, I try to quickly disable all that code with the following:

I am not able to test (in this situation) the situation where gennum is increased, but I think it is possible, and there are few situations where dependency is not enough. But maybe my thoughts are too pessimistic, and this aparate is not necessary.

1. update of binary custom type - the dependency allows an extension update, and after update the binary format can be changed. Now I think this part is useless, because although the extension can be updated, the dll cannot be unloaded, so the loaded implementation of custom session type will be the same until session end.

2. altering composite type - the generation number reduces overhead with checking compatibility of stored value and expected value. With gennum I need to run compatibility checks just once per transaction. When the gennum is the same, I can return data without any conversion. 

3. I try to use gennum for detection of oid overflow. The value is stored in the session memory context in the hash table. The related memory can be cleaned at transaction end (when memory is deleted) and when I can read system catalog (transaction is not aborted). When a transaction is aborted, then I cannot read the system catalog, and I have to postpone cleaning to the next usage of the session variable. Theoretically, the session can be inactive for a longer time and the system catalog can be changed a lot (and the oid counter can be restarted). 

I am checking:

3.1 if variable with oid still exists

3.2 if the variable has assigned type with same oid

3.3. if type fingerprint is same - and I can expect so the type with same oid is same type

3.2 and 3.3 are safe guard for cases where oid is restarted, and I cannot believe the consistency of values stored in memory.

This is a very different situation than for example temporary tables. Every temp table for every session has its own entry in the system catalog, so protection based on dependency can work. But record of session variable is shared - It is protected inside transaction, but session variables are living in session. Without transaction there is not any lock on the item in pg_variable, so I can drop the session variable although the value is stored in session memory in some other session. After dropping the related plans are resetted, but the stored value itself stays in memory and can be accessed - if some future variable takes the same oid. With gennum I have 3x checks - that should ensure that the returned value should be always binary valid.

Now, I am thinking about another, maybe more simple identity check, and it should to work and it can less code than solution based on type's fingerprints

I can introduce a 64bit sequence and I can store the value of seq in pg_variable record. Then the identity check can be just savedoid = oid and savedseqnum = seqnum

What do you think about this idea? The overhead of that can be reduced, because for on transaction commit drop or on transaction end reset session variables we don't need it.


 

diff --git a/src/backend/commands/sessionvariable.c b/src/backend/commands/sessionvariable.c
index 9b4f9482a4..7c8808dc46 100644
--- a/src/backend/commands/sessionvariable.c
+++ b/src/backend/commands/sessionvariable.c
@@ -794,6 +794,8 @@ svartype_verify_composite_fast(SVariableType svt)
 static int64
 get_svariable_valid_type_gennum(SVariableType svt)
 {
+   return 1;
+
    HeapTuple   tuple;
    bool        fast_check = true;

@@ -905,6 +907,8 @@ get_svariabletype(Oid typid)
 static bool
 session_variable_use_valid_type(SVariable svar)
 {
+   return true;
+
    Assert(svar);
    Assert(svar->svartype);

And session_variable.sql regression test still works just fine.  Am I missing
something?

the regress test doesn't try to reset oid counter
 

While at it, the initial comment should probably say "free local memory" rather
than "purge memory".

changed
 

- doc are missing for GRANT/REVOKE ... ON ALL VARIABLES

done
 

- plpgsql.sgml:
+   <sect3>
+    <title><command>Session variables and constants</command></title>


rewroted just to "Session variables"

 
I don't think it's ok to use "constant" as an alias for immutable session
variable as immutable session variable can actually be changed.

Similarly, in catalogs.sgml:

+       <structfield>varisimmutable</structfield> <type>boolean</type>
+      </para>
+      <para>
+       True if the variable is immutable (cannot be modified). The default value is false.
+      </para></entry>
+     </row>

I think there should be a note and a link to the corresponding part in
create_variable.sgml to explain what exactly is an immutable variable, as the
implemented behavior (for nullable immutable variable) is somewhat unexpected.

done
 

- other nitpicking: pg_variable and struct Variable seems a bit inconsistent.
  For instance one uses vartype and vartypmod and the other typid and typmod,
  while both use varname and varnamespace.  I think we should avoid discrepancy
  here.

I did it because I needed to rename the namespace field, but the prefix var is not the best. I don't think so using same names like pg_variable in Variable is good idea (due fields like varisnotnull, varisimmutable), but I can the rename varnane and varnamespace to name and namespaceid, what is better than varname, and varnamespace.


Also, there's a sessionvariable.c and a session_variable.h.  Let's use
session_variable.[ch], as it seems more readable?

renamed
 

-typedef patch: missing SVariableTypeData, some commits need a pgindent, e.g:

+typedef SVariableTypeData * SVariableType;

+typedef SVariableData * SVariable;

+static SessionVariableValue * RestoreSessionVariables(char **start_address,
+                                                   int *num_session_variables);

+static Query *transformLetStmt(ParseState *pstate,
+                              LetStmt * stmt);

(and multiple others)

I fixed these.

Thank you for comments

Pavel
 
Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: standby promotion can create unreadable WAL
Next
From: Pavel Stehule
Date:
Subject: Re: Schema variables - new implementation for Postgres 15