Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: Schema variables - new implementation for Postgres 15 |
Date | |
Msg-id | CAOBaU_bKPU+zGxmB-d-CgTPin=d0FH+qyf_ef_Z-ftdZAQ-Oiw@mail.gmail.com Whole thread Raw |
In response to | Re: Schema variables - new implementation for Postgres 15 (Julien Rouhaud <rjuju123@gmail.com>) |
List | pgsql-hackers |
Hi, The patch has rotten again, sending an updated version. Also, after talking with Pavel, he can't work on this patch before a few days so I'm adding some extra fixup patches for the things I reported in the last few emails, so that the cfbot can hopefully turn green. On Thu, Sep 22, 2022 at 2:41 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Fri, Sep 16, 2022 at 11:59:04AM +0800, Julien Rouhaud wrote: > > Hi, > > > > On Sun, Sep 11, 2022 at 09:29:49PM +0200, Pavel Stehule wrote: > > >> > > >> Originally it was not possible, because there was no xact_reset_varids list, and without this list the processing > > >> ON_COMMIT_DROP started DROP VARIABLE command, and there was a request for ON_COMMIT_RESET action. > > >> Now, it is possible, because in RemoveSessionVariable is conditional execution: > > >> > > >> <--><--><-->if (!svar->eox_reset) > > >> <--><--><--><-->register_session_variable_xact_action(varid, > > >> <--><--><--><--><--><--><--><--><--><--><--><--><--> SVAR_ON_COMMIT_RESET); > > >> <--><-->} > > >> > > >> So when we process ON_COMMIT_DROP actions, we know that the reset will not be processed by ON_COMMIT_RESET action, > > >> and then these lists can be merged. > > >> > > >> so I merged these two lists to one > > > > Thanks! This really helps with code readability, and after looking at it I > > found some issues (see below). > > > > > > changes: > > > > > > - some minor cleaning > > > - refactoring of RemoveSessionVariable - move part of code to pg_variable.c > > > > Thanks. I think we could still do more to split what code belongs to > > pg_variable.c and session_variable.c. In my opinion, the various DDL code > > should only invoke functions in pg_variable.c, which themselves can call > > function in session_variable.c if needed, and session_variable shouldn't know > > about CreateSessionVarStmt (which should probably be rename > > CreateVariableStmt?) or VariableRelationId. After an off-list bikeshedding > > session with Pavel, we came up with SessionVariableCreatePostprocess() and > > SessionVariableDropPostprocess() for the functions in session_variable.c called > > by pg_variable.c when handling CREATE VARIABLE and DROP VARIABLE commands. > > > > I'm attaching a new patchset with this change and some more (see below). I'm > > not sending .txt files as this is rebased on top on the recent GUC refactoring > > patch. It won't change the cfbot outcome though, as I also add new regression > > tests that are for now failing (see below). I tried to keep the changes in > > extra "FIXUP" patches when possible, but the API changes in the first patch > > cause conflicts in the next one, so the big session variable patch has to > > contain the needed changes. > > > > In this patchset, I also changed the following: > > > > - global pass on the comments in session_variable > > - removed now useless sessionvars_types > > - added missing prototypes for static functions (for consistency), and moved > > all the static functions before the static function > > - minor other nitpicking / stylistic changes > > > > Here are the problems I found: > > > > - IdentifyVariable() > > > > /* > > * Lock relation. This will also accept any pending invalidation > > * messages. If we got back InvalidOid, indicating not found, then > > * there's nothing to lock, but we accept invalidation messages > > * anyway, to flush any negative catcache entries that may be > > * lingering. > > */ > > + if (!OidIsValid(varid)) > > + AcceptInvalidationMessages(); > > + else if (OidIsValid(varid)) > > + LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock); > > + > > + if (inval_count == SharedInvalidMessageCounter) > > + break; > > + > > + retry = true; > > + old_varid = varid; > > + } > > > > AFAICS it's correct, but just to be extra cautious I'd explicitly set varid to > > InvalidOid before looping, so you restart in the same condition as the first > > iteration (since varid is initialize when declared). Also, the comments should > > be modified, it's "Lock variable", not "Lock relation", same for the comment in > > the previous chunk ("we've locked the relation that used to have this > > name..."). > > > > +Datum > > +pg_debug_show_used_session_variables(PG_FUNCTION_ARGS) > > +{ > > +[...] > > + else > > + { > > + /* > > + * When session variable was removed from catalog, but still > > + * it in memory. The memory was not purged yet. > > + */ > > + nulls[1] = true; > > + nulls[2] = true; > > + nulls[4] = true; > > + values[5] = BoolGetDatum(true); > > + nulls[6] = true; > > + nulls[7] = true; > > + nulls[8] = true; > > + } > > > > I'm wondering if we could try to improve things a bit here. Maybe display the > > variable oid instead of its name as we still have that information, the type > > (using FORMAT_TYPE_ALLOW_INVALID as there's no guarantee that the type would > > still exist without the dependency) and whether the variable is valid (at least > > per its stored value). We can keep NULL for the privileges, as there's no API > > avoid erroring if the role has been dropped. > > > > +{ oid => '8488', descr => 'debug list of used session variables', > > + proname => 'pg_debug_show_used_session_variables', prorows => '1000', proretset => 't', > > + provolatile => 's', prorettype => 'record', proargtypes => '', > > + proallargtypes => '{oid,text,text,oid,text,bool,bool,bool,bool}', > > + proargmodes => '{o,o,o,o,o,o,o,o,o}', > > + proargnames => '{varid,schema,name,typid,typname,removed,has_value,can_read,can_write}', > > > > Since we change READ / WRITE acl for SELECT / UPDATE, we should rename the > > column can_select and can_update. > > > > +static void > > +pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue) > > +{ > > + [...] > > + while ((svar = (SVariable) hash_seq_search(&status)) != NULL) > > + { > > + if (hashvalue == 0 || svar->hashvalue == hashvalue) > > + { > > + [...] > > + xact_recheck_varids = list_append_unique_oid(xact_recheck_varids, > > + svar->varid); > > > > This has a pretty terrible complexity. It can degenerate badly, and there > > isn't any CHECK_FOR_INTERRUPTS so you could easily lock a backend for quite > > some time. > > > > I think we should just keep appending oids, and do a list_sort(list, > > list_oid_cmp) and list_deduplicate_oid(list) before processing the list, in > > sync_sessionvars_all() and AtPreEOXact_SessionVariable_on_xact_actions(). > > > > Maybe while at it we could reuse sync_sessionvars_all in > > AtPreEOXact_SessionVariable_on_xact_actions (with a way to ask > > for the lxid check or not), rather than duplicating the whole logic twice? > > > > +/* > > + * Fast drop of the complete content of all session variables hash table. > > + * This is code for DISCARD VARIABLES command. This command > > + * cannot be run inside transaction, so we don't need to handle > > + * end of transaction actions. > > + */ > > +void > > +ResetSessionVariables(void) > > +{ > > + /* Destroy hash table and reset related memory context */ > > + if (sessionvars) > > + { > > + hash_destroy(sessionvars); > > + sessionvars = NULL; > > + > > + hash_destroy(sessionvars_types); > > + sessionvars_types = NULL; > > + } > > + > > + /* Release memory allocated by session variables */ > > + if (SVariableMemoryContext != NULL) > > + MemoryContextReset(SVariableMemoryContext); > > + > > + /* > > + * There are not any session variables left, so simply trim xact > > + * action list, and other lists. > > + */ > > + list_free_deep(xact_on_commit_actions); > > + xact_on_commit_actions = NIL; > > + > > + /* We should clean xact_reset_varids */ > > + list_free(xact_reset_varids); > > + xact_reset_varids = NIL; > > + > > + /* we should clean xact_recheck_varids */ > > + list_free(xact_recheck_varids); > > + xact_recheck_varids = NIL; > > +} > > > > The initial comment is wrong. This function is used for both DISCARD VARIABLES > > and DISCARD ALL, but only DISCARD ALL isn't allowed in a transaction (I fixed > > the comment in the attached patchset). > > We should allow DISCARD VARIABLES in a transaction, therefore it needs some > > more thinking on which list can be freed, and in which context it should hold > > its data. AFAICS the only problematic case is ON COMMIT DROP, but an extra > > check wouldn't hurt. For instance: > > > > rjuju=# BEGIN; > > BEGIN > > > > rjuju=# CREATE TEMP VARIABLE v AS int ON COMMIT DROP; > > CREATE VARIABLE > > > > rjuju=# DISCARD VARIABLES ; > > DISCARD VARIABLES > > > > rjuju=# COMMIT; > > COMMIT > > > > rjuju=# \dV > > List of variables > > Schema | Name | Type | Collation | Nullable | Mutable | Default | Owner | Transactional end action > > -----------+------+---------+-----------+----------+---------+---------+-------+-------------------------- > > pg_temp_3 | v | integer | | t | t | | rjuju | ON COMMIT DROP > > (1 row) > > > > Note that I still think that keeping a single List for both SVariableXActAction > > helps for readability, even if it means cherry-picking which items should be > > removed on DISCARD VARIABLES (which shouldn't be a very frequent operation > > anyway). > > > > Also, xact_recheck_varids is allocated in SVariableMemoryContext, so DISCARD > > VARIABLE will crash if there's any pending recheck action. > > > > There's only one regression test for DISCARD VARIABLE, which clearly wasn't > > enough. There should be one for the ON COMMIT DROP (which can be added in > > normal regression test), one one with all action list populated (that need to > > be in isolation tester). Both are added in the patchset in a suggestion patch, > > and for now the first test fails and the second crashes. > > > > > > - set_session_variable() is documented to either succeed or not change the > > currently set value. While it's globally true, I see 2 things that could be > > problematic: > > > > - free_session_variable_value() could technically fail. However, I don't see > > how it could be happening unless there's a memory corruption, so this would > > result in either an abort, or a backend in a very bad state. Anyway, since > > pfree() can clearly ereport(ERROR) we should probably do something about > > it. That being said, I don't really see the point of trying to preserve a > > value that looks like random pointer, which will probably cause a segfault > > the next time it's used. Maybe add a PG_TRY block around the call and mark > > it as invalid (and set freeval to false) if that happens? > > > > - the final elog(DEBUG1) can also fail. It also seems highly unlikely, so > > maybe accept that this exception is ok? For now I'm adding such a comment > > in a suggestion patch. > > > > - prepare_variable_for_reading() and SetSessionVariable(): > > > > + /* Ensure so all entries in sessionvars hash table are valid */ > > + sync_sessionvars_all(); > > + > > + /* Protect used session variable against drop until transaction end */ > > + LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock); > > > > It's possible that a session variable is dropped after calling > > sync_sessionvars_all(), and we would receive the sinval when acquiring the lock > > on VariableRelationId but not process it until the next sync_sessionvars_all > > call. I think we should acquire the lock first and then call > > sync_sessionvars_all. I did that in the suggestion patch. > > Rebased patcshet against recent conflicts, thanks to Pavel for the reminder. > > While sending a new patch, I realized that I forgot mentionning this in > execMain.c: > > @@ -200,6 +201,61 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags) > Assert(queryDesc->sourceText != NULL); > estate->es_sourceText = queryDesc->sourceText; > > + /* > + * The executor doesn't work with session variables directly. Values of > + * related session variables are copied to dedicated array, and this array > + * is passed to executor. > + */ > + if (queryDesc->num_session_variables > 0) > + { > + /* > + * When paralel access to query parameters (including related session > + * variables) is required, then related session variables are restored > + * (deserilized) in queryDesc already. So just push pointer of this > + * array to executor's estate. > + */ > + estate->es_session_variables = queryDesc->session_variables; > + estate->es_num_session_variables = queryDesc->num_session_variables; > + } > + else if (queryDesc->plannedstmt->sessionVariables) > + { > + ListCell *lc; > + int nSessionVariables; > + int i = 0; > + > + /* > + * In this case, the query uses session variables, but we have to > + * prepare the array with passed values (of used session variables) > + * first. > + */ > + nSessionVariables = list_length(queryDesc->plannedstmt->sessionVariables); > + > + /* Create the array used for passing values of used session variables */ > + estate->es_session_variables = (SessionVariableValue *) > + palloc(nSessionVariables * sizeof(SessionVariableValue)); > + > + /* Fill the array */ > + [...] > + > + estate->es_num_session_variables = nSessionVariables; > + } > > I haven't looked at that part yet, but the comments are a bit obscure. IIUC > the first branch is for parallel workers only, if the main backend provided the > array, and the 2nd chunk is for the main backend. If so, it could be made > clearer, and maybe add an assert about IsParallelWorker() (or > !IsParallelWorker()) as needed? Full list of changes: - rebased against multiple conflicts since last version - fixed the meson build - fixed the ON COMMIT DROP problem and the crash on RESET VARIABLES - fixed some copy/pasto in the expected isolation tests (visible now that it works) - added the asserts and tried to clarify the comments for the session variable handling in QueryDesc (I still haven't really read that part) - did the mentioned modifications on pg_debug_show_used_session_variables, and used CStringGetTextDatum macro to simplify the code Note that while waiting for the CI to finish I noticed that the commit message for 0001 still mentions the READ/WRITE acl. The commit messages will probably need a bit of rewording too once everything else is fixed, but this one could be changed already.
Attachment
pgsql-hackers by date: