Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: Schema variables - new implementation for Postgres 15 |
Date | |
Msg-id | 20220916035904.6qdtcdl44lycltrg@jrouhaud Whole thread Raw |
In response to | Re: Schema variables - new implementation for Postgres 15 (Pavel Stehule <pavel.stehule@gmail.com>) |
List | pgsql-hackers |
Hi, On Sun, Sep 11, 2022 at 09:29:49PM +0200, Pavel Stehule wrote: >> >> Originally it was not possible, because there was no xact_reset_varids list, and without this list the processing >> ON_COMMIT_DROP started DROP VARIABLE command, and there was a request for ON_COMMIT_RESET action. >> Now, it is possible, because in RemoveSessionVariable is conditional execution: >> >> <--><--><-->if (!svar->eox_reset) >> <--><--><--><-->register_session_variable_xact_action(varid, >> <--><--><--><--><--><--><--><--><--><--><--><--><--> SVAR_ON_COMMIT_RESET); >> <--><-->} >> >> So when we process ON_COMMIT_DROP actions, we know that the reset will not be processed by ON_COMMIT_RESET action, >> and then these lists can be merged. >> >> so I merged these two lists to one Thanks! This really helps with code readability, and after looking at it I found some issues (see below). > > changes: > > - some minor cleaning > - refactoring of RemoveSessionVariable - move part of code to pg_variable.c Thanks. I think we could still do more to split what code belongs to pg_variable.c and session_variable.c. In my opinion, the various DDL code should only invoke functions in pg_variable.c, which themselves can call function in session_variable.c if needed, and session_variable shouldn't know about CreateSessionVarStmt (which should probably be rename CreateVariableStmt?) or VariableRelationId. After an off-list bikeshedding session with Pavel, we came up with SessionVariableCreatePostprocess() and SessionVariableDropPostprocess() for the functions in session_variable.c called by pg_variable.c when handling CREATE VARIABLE and DROP VARIABLE commands. I'm attaching a new patchset with this change and some more (see below). I'm not sending .txt files as this is rebased on top on the recent GUC refactoring patch. It won't change the cfbot outcome though, as I also add new regression tests that are for now failing (see below). I tried to keep the changes in extra "FIXUP" patches when possible, but the API changes in the first patch cause conflicts in the next one, so the big session variable patch has to contain the needed changes. In this patchset, I also changed the following: - global pass on the comments in session_variable - removed now useless sessionvars_types - added missing prototypes for static functions (for consistency), and moved all the static functions before the static function - minor other nitpicking / stylistic changes Here are the problems I found: - IdentifyVariable() /* * Lock relation. This will also accept any pending invalidation * messages. If we got back InvalidOid, indicating not found, then * there's nothing to lock, but we accept invalidation messages * anyway, to flush any negative catcache entries that may be * lingering. */ + if (!OidIsValid(varid)) + AcceptInvalidationMessages(); + else if (OidIsValid(varid)) + LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock); + + if (inval_count == SharedInvalidMessageCounter) + break; + + retry = true; + old_varid = varid; + } AFAICS it's correct, but just to be extra cautious I'd explicitly set varid to InvalidOid before looping, so you restart in the same condition as the first iteration (since varid is initialize when declared). Also, the comments should be modified, it's "Lock variable", not "Lock relation", same for the comment in the previous chunk ("we've locked the relation that used to have this name..."). +Datum +pg_debug_show_used_session_variables(PG_FUNCTION_ARGS) +{ +[...] + else + { + /* + * When session variable was removed from catalog, but still + * it in memory. The memory was not purged yet. + */ + nulls[1] = true; + nulls[2] = true; + nulls[4] = true; + values[5] = BoolGetDatum(true); + nulls[6] = true; + nulls[7] = true; + nulls[8] = true; + } I'm wondering if we could try to improve things a bit here. Maybe display the variable oid instead of its name as we still have that information, the type (using FORMAT_TYPE_ALLOW_INVALID as there's no guarantee that the type would still exist without the dependency) and whether the variable is valid (at least per its stored value). We can keep NULL for the privileges, as there's no API avoid erroring if the role has been dropped. +{ oid => '8488', descr => 'debug list of used session variables', + proname => 'pg_debug_show_used_session_variables', prorows => '1000', proretset => 't', + provolatile => 's', prorettype => 'record', proargtypes => '', + proallargtypes => '{oid,text,text,oid,text,bool,bool,bool,bool}', + proargmodes => '{o,o,o,o,o,o,o,o,o}', + proargnames => '{varid,schema,name,typid,typname,removed,has_value,can_read,can_write}', Since we change READ / WRITE acl for SELECT / UPDATE, we should rename the column can_select and can_update. +static void +pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue) +{ + [...] + while ((svar = (SVariable) hash_seq_search(&status)) != NULL) + { + if (hashvalue == 0 || svar->hashvalue == hashvalue) + { + [...] + xact_recheck_varids = list_append_unique_oid(xact_recheck_varids, + svar->varid); This has a pretty terrible complexity. It can degenerate badly, and there isn't any CHECK_FOR_INTERRUPTS so you could easily lock a backend for quite some time. I think we should just keep appending oids, and do a list_sort(list, list_oid_cmp) and list_deduplicate_oid(list) before processing the list, in sync_sessionvars_all() and AtPreEOXact_SessionVariable_on_xact_actions(). Maybe while at it we could reuse sync_sessionvars_all in AtPreEOXact_SessionVariable_on_xact_actions (with a way to ask for the lxid check or not), rather than duplicating the whole logic twice? +/* + * Fast drop of the complete content of all session variables hash table. + * This is code for DISCARD VARIABLES command. This command + * cannot be run inside transaction, so we don't need to handle + * end of transaction actions. + */ +void +ResetSessionVariables(void) +{ + /* Destroy hash table and reset related memory context */ + if (sessionvars) + { + hash_destroy(sessionvars); + sessionvars = NULL; + + hash_destroy(sessionvars_types); + sessionvars_types = NULL; + } + + /* Release memory allocated by session variables */ + if (SVariableMemoryContext != NULL) + MemoryContextReset(SVariableMemoryContext); + + /* + * There are not any session variables left, so simply trim xact + * action list, and other lists. + */ + list_free_deep(xact_on_commit_actions); + xact_on_commit_actions = NIL; + + /* We should clean xact_reset_varids */ + list_free(xact_reset_varids); + xact_reset_varids = NIL; + + /* we should clean xact_recheck_varids */ + list_free(xact_recheck_varids); + xact_recheck_varids = NIL; +} The initial comment is wrong. This function is used for both DISCARD VARIABLES and DISCARD ALL, but only DISCARD ALL isn't allowed in a transaction (I fixed the comment in the attached patchset). We should allow DISCARD VARIABLES in a transaction, therefore it needs some more thinking on which list can be freed, and in which context it should hold its data. AFAICS the only problematic case is ON COMMIT DROP, but an extra check wouldn't hurt. For instance: rjuju=# BEGIN; BEGIN rjuju=# CREATE TEMP VARIABLE v AS int ON COMMIT DROP; CREATE VARIABLE rjuju=# DISCARD VARIABLES ; DISCARD VARIABLES rjuju=# COMMIT; COMMIT rjuju=# \dV List of variables Schema | Name | Type | Collation | Nullable | Mutable | Default | Owner | Transactional end action -----------+------+---------+-----------+----------+---------+---------+-------+-------------------------- pg_temp_3 | v | integer | | t | t | | rjuju | ON COMMIT DROP (1 row) Note that I still think that keeping a single List for both SVariableXActAction helps for readability, even if it means cherry-picking which items should be removed on DISCARD VARIABLES (which shouldn't be a very frequent operation anyway). Also, xact_recheck_varids is allocated in SVariableMemoryContext, so DISCARD VARIABLE will crash if there's any pending recheck action. There's only one regression test for DISCARD VARIABLE, which clearly wasn't enough. There should be one for the ON COMMIT DROP (which can be added in normal regression test), one one with all action list populated (that need to be in isolation tester). Both are added in the patchset in a suggestion patch, and for now the first test fails and the second crashes. - set_session_variable() is documented to either succeed or not change the currently set value. While it's globally true, I see 2 things that could be problematic: - free_session_variable_value() could technically fail. However, I don't see how it could be happening unless there's a memory corruption, so this would result in either an abort, or a backend in a very bad state. Anyway, since pfree() can clearly ereport(ERROR) we should probably do something about it. That being said, I don't really see the point of trying to preserve a value that looks like random pointer, which will probably cause a segfault the next time it's used. Maybe add a PG_TRY block around the call and mark it as invalid (and set freeval to false) if that happens? - the final elog(DEBUG1) can also fail. It also seems highly unlikely, so maybe accept that this exception is ok? For now I'm adding such a comment in a suggestion patch. - prepare_variable_for_reading() and SetSessionVariable(): + /* Ensure so all entries in sessionvars hash table are valid */ + sync_sessionvars_all(); + + /* Protect used session variable against drop until transaction end */ + LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock); It's possible that a session variable is dropped after calling sync_sessionvars_all(), and we would receive the sinval when acquiring the lock on VariableRelationId but not process it until the next sync_sessionvars_all call. I think we should acquire the lock first and then call sync_sessionvars_all. I did that in the suggestion patch.
Attachment
- v20220916-0001-catalog-support-for-session-variables.patch
- v20220916-0002-FIXUP-0001-catalog-support-for-session-var.patch
- v20220916-0003-session-variables.patch
- v20220916-0004-FIXUP-0003-session-variables.patch
- v20220916-0005-LET-command.patch
- v20220916-0006-support-of-LET-command-in-PLpgSQL.patch
- v20220916-0007-DISCARD-VARIABLES-command.patch
- v20220916-0008-enhancing-psql-for-session-variables.patch
- v20220916-0009-possibility-to-dump-session-variables-by-p.patch
- v20220916-0010-regress-tests-for-session-variables.patch
- v20220916-0011-FIXUP-regress-tests-for-session-variables.patch
- v20220916-0012-this-patch-changes-error-message-column-do.patch
- v20220916-0013-documentation.patch
pgsql-hackers by date: