Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Schema variables - new implementation for Postgres 15
Date
Msg-id 20220916035904.6qdtcdl44lycltrg@jrouhaud
Whole thread Raw
In response to Re: Schema variables - new implementation for Postgres 15  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Hi,

On Sun, Sep 11, 2022 at 09:29:49PM +0200, Pavel Stehule wrote:
>>
>> Originally it was not possible, because there was no xact_reset_varids list, and without this list the processing
>> ON_COMMIT_DROP started DROP VARIABLE command, and there was a request for ON_COMMIT_RESET action.
>> Now, it is possible, because in RemoveSessionVariable is conditional execution:
>> 
>> <--><--><-->if (!svar->eox_reset)
>> <--><--><--><-->register_session_variable_xact_action(varid,
>> <--><--><--><--><--><--><--><--><--><--><--><--><-->  SVAR_ON_COMMIT_RESET);
>> <--><-->}
>> 
>> So when we process ON_COMMIT_DROP actions, we know that the reset will not be processed by ON_COMMIT_RESET action,
>> and then these lists can be merged.
>> 
>> so I merged these two lists to one

Thanks!  This really helps with code readability, and after looking at it I
found some issues (see below).
>
> changes:
>
> - some minor cleaning
> - refactoring of RemoveSessionVariable  - move part of code to pg_variable.c

Thanks.  I think we could still do more to split what code belongs to
pg_variable.c and session_variable.c.  In my opinion, the various DDL code
should only invoke functions in pg_variable.c, which themselves can call
function in session_variable.c if needed, and session_variable shouldn't know
about CreateSessionVarStmt (which should probably be rename
CreateVariableStmt?) or VariableRelationId.  After an off-list bikeshedding
session with Pavel, we came up with SessionVariableCreatePostprocess() and
SessionVariableDropPostprocess() for the functions in session_variable.c called
by pg_variable.c when handling CREATE VARIABLE and DROP VARIABLE commands.

I'm attaching a new patchset with this change and some more (see below).  I'm
not sending .txt files as this is rebased on top on the recent GUC refactoring
patch.  It won't change the cfbot outcome though, as I also add new regression
tests that are for now failing (see below).  I tried to keep the changes in
extra "FIXUP" patches when possible, but the API changes in the first patch
cause conflicts in the next one, so the big session variable patch has to
contain the needed changes.

In this patchset, I also changed the following:

- global pass on the comments in session_variable
- removed now useless sessionvars_types
- added missing prototypes for static functions (for consistency), and moved
  all the static functions before the static function
- minor other nitpicking / stylistic changes

Here are the problems I found:

- IdentifyVariable()

        /*
         * Lock relation.  This will also accept any pending invalidation
         * messages.  If we got back InvalidOid, indicating not found, then
         * there's nothing to lock, but we accept invalidation messages
         * anyway, to flush any negative catcache entries that may be
         * lingering.
         */
+        if (!OidIsValid(varid))
+            AcceptInvalidationMessages();
+        else if (OidIsValid(varid))
+            LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
+
+        if (inval_count == SharedInvalidMessageCounter)
+            break;
+
+        retry = true;
+        old_varid = varid;
+    }

AFAICS it's correct, but just to be extra cautious I'd explicitly set varid to
InvalidOid before looping, so you restart in the same condition as the first
iteration (since varid is initialize when declared).  Also, the comments should
be modified, it's "Lock variable", not "Lock relation", same for the comment in
the previous chunk ("we've locked the relation  that used to have this
name...").

+Datum
+pg_debug_show_used_session_variables(PG_FUNCTION_ARGS)
+{
+[...]
+            else
+            {
+                /*
+                 * When session variable was removed from catalog, but still
+                 * it in memory. The memory was not purged yet.
+                 */
+                nulls[1] = true;
+                nulls[2] = true;
+                nulls[4] = true;
+                values[5] = BoolGetDatum(true);
+                nulls[6] = true;
+                nulls[7] = true;
+                nulls[8] = true;
+            }

I'm wondering if we could try to improve things a bit here.  Maybe display the
variable oid instead of its name as we still have that information, the type
(using FORMAT_TYPE_ALLOW_INVALID as there's no guarantee that the type would
still exist without the dependency) and whether the variable is valid (at least
per its stored value).  We can keep NULL for the privileges, as there's no API
avoid erroring if the role has been dropped.

+{ oid => '8488', descr => 'debug list of used session variables',
+  proname => 'pg_debug_show_used_session_variables', prorows => '1000', proretset => 't',
+  provolatile => 's', prorettype => 'record', proargtypes => '',
+  proallargtypes => '{oid,text,text,oid,text,bool,bool,bool,bool}',
+  proargmodes => '{o,o,o,o,o,o,o,o,o}',
+  proargnames => '{varid,schema,name,typid,typname,removed,has_value,can_read,can_write}',

Since we change READ / WRITE acl for SELECT / UPDATE, we should rename the
column can_select and can_update.

+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
+{
+ [...]
+    while ((svar = (SVariable) hash_seq_search(&status)) != NULL)
+    {
+        if (hashvalue == 0 || svar->hashvalue == hashvalue)
+        {
+ [...]
+            xact_recheck_varids = list_append_unique_oid(xact_recheck_varids,
+                                                         svar->varid);

This has a pretty terrible complexity.  It can degenerate badly, and there
isn't any CHECK_FOR_INTERRUPTS so you could easily lock a backend for quite
some time.

I think we should just keep appending oids, and do a list_sort(list,
list_oid_cmp) and list_deduplicate_oid(list) before processing the list, in
sync_sessionvars_all() and AtPreEOXact_SessionVariable_on_xact_actions().

Maybe while at it we could reuse sync_sessionvars_all in
AtPreEOXact_SessionVariable_on_xact_actions (with a way to ask
for the lxid check or not), rather than duplicating the whole logic twice?

+/*
+ * Fast drop of the complete content of all session variables hash table.
+ * This is code for DISCARD VARIABLES command. This command
+ * cannot be run inside transaction, so we don't need to handle
+ * end of transaction actions.
+ */
+void
+ResetSessionVariables(void)
+{
+    /* Destroy hash table and reset related memory context */
+    if (sessionvars)
+    {
+        hash_destroy(sessionvars);
+        sessionvars = NULL;
+
+        hash_destroy(sessionvars_types);
+        sessionvars_types = NULL;
+    }
+
+    /* Release memory allocated by session variables */
+    if (SVariableMemoryContext != NULL)
+        MemoryContextReset(SVariableMemoryContext);
+
+    /*
+     * There are not any session variables left, so simply trim xact
+     * action list, and other lists.
+     */
+    list_free_deep(xact_on_commit_actions);
+    xact_on_commit_actions = NIL;
+
+    /* We should clean xact_reset_varids */
+    list_free(xact_reset_varids);
+    xact_reset_varids = NIL;
+
+    /* we should clean xact_recheck_varids */
+    list_free(xact_recheck_varids);
+    xact_recheck_varids = NIL;
+}

The initial comment is wrong.  This function is used for both DISCARD VARIABLES
and DISCARD ALL, but only DISCARD ALL isn't allowed in a transaction (I fixed
the comment in the attached patchset).
We should allow DISCARD VARIABLES in a transaction, therefore it needs some
more thinking on which list can be freed, and in which context it should hold
its data.  AFAICS the only problematic case is ON COMMIT DROP, but an extra
check wouldn't hurt.  For instance:

rjuju=# BEGIN;
BEGIN

rjuju=# CREATE TEMP VARIABLE v AS int ON COMMIT DROP;
CREATE VARIABLE

rjuju=# DISCARD VARIABLES ;
DISCARD VARIABLES

rjuju=# COMMIT;
COMMIT

rjuju=# \dV
                                            List of variables
  Schema   | Name |  Type   | Collation | Nullable | Mutable | Default | Owner | Transactional end action
-----------+------+---------+-----------+----------+---------+---------+-------+--------------------------
 pg_temp_3 | v    | integer |           | t        | t       |         | rjuju | ON COMMIT DROP
(1 row)

Note that I still think that keeping a single List for both SVariableXActAction
helps for readability, even if it means cherry-picking which items should be
removed on DISCARD VARIABLES (which shouldn't be a very frequent operation
anyway).

Also, xact_recheck_varids is allocated in SVariableMemoryContext, so DISCARD
VARIABLE will crash if there's any pending recheck action.

There's only one regression test for DISCARD VARIABLE, which clearly wasn't
enough.  There should be one for the ON COMMIT DROP (which can be added in
normal regression test), one one with all action list populated (that need to
be in isolation tester).  Both are added in the patchset in a suggestion patch,
and for now the first test fails and the second crashes.


- set_session_variable() is documented to either succeed or not change the
  currently set value.  While it's globally true, I see 2 things that could be
  problematic:

  - free_session_variable_value() could technically fail.  However, I don't see
    how it could be happening unless there's a memory corruption, so this would
    result in either an abort, or a backend in a very bad state.  Anyway, since
    pfree() can clearly ereport(ERROR) we should probably do something about
    it.  That being said, I don't really see the point of trying to preserve a
    value that looks like random pointer, which will probably cause a segfault
    the next time it's used.  Maybe add a PG_TRY block around the call and mark
    it as invalid (and set freeval to false) if that happens?

  - the final elog(DEBUG1) can also fail.  It also seems highly unlikely, so
    maybe accept that this exception is ok?  For now I'm adding such a comment
    in a suggestion patch.

- prepare_variable_for_reading() and SetSessionVariable():

+    /* Ensure so all entries in sessionvars hash table are valid */
+    sync_sessionvars_all();
+
+    /* Protect used session variable against drop until transaction end */
+    LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);

It's possible that a session variable is dropped after calling
sync_sessionvars_all(), and we would receive the sinval when acquiring the lock
on VariableRelationId but not process it until the next sync_sessionvars_all
call.  I think we should acquire the lock first and then call
sync_sessionvars_all.  I did that in the suggestion patch.

Attachment

pgsql-hackers by date:

Previous
From: Japin Li
Date:
Subject: Re: Typo in xact.c
Next
From: Japin Li
Date:
Subject: Re: Typo in xact.c