Re: GUC thread-safety approaches - Mailing list pgsql-hackers
| From | Matthias van de Meent |
|---|---|
| Subject | Re: GUC thread-safety approaches |
| Date | |
| Msg-id | CAEze2WjYnMOWKEhbSeKi8kV-0U5R=xWPXJkMkPmvzfJKoB5pWA@mail.gmail.com |
| In response to | GUC thread-safety approaches (Peter Eisentraut <peter@eisentraut.org>) |
| List | pgsql-hackers |
On Tue, 18 Nov 2025 at 09:50, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I want to discuss possible approaches to making the GUC system
> thread-safe. In particular, I want to talk about the global
> variables.
>
> A GUC parameter is defined via a struct config_generic record that
> contains some metadata and a pointer to a global variable. For
> example (simplified):
>
> // elsewhere
> bool enable_seqscan = true;
>
> // elsewhere
> extern bool enable_seqscan;
>
> // in guc_tables.inc.c (generated)
> ...
> {
> .name = "enable_seqscan",
> .context = PGC_USERSET,
> .group = QUERY_TUNING_METHOD,
> .short_desc = gettext_noop("Enables the planner's use of sequential-scan plans."),
> .flags = GUC_EXPLAIN,
> .vartype = PGC_BOOL,
> ._bool = {
> .variable = &enable_seqscan, // HERE
> .boot_val = true,
> },
> },
>
> For a multithreaded server, one of the ideas thrown around was to
> convert (most) global variables to thread-local variables. That way,
> the overall structure of the code could remain the same, and each
> thread would see the same set of global variables as before.
> So you could do
>
> thread_local bool enable_seqscan = true;
>
> and
>
> extern thread_local bool enable_seqscan;
>
> and as far as the code in optimizer/path/costsize.c or wherever is
> concerned, it would work the same way as before.
>
> But that doesn't work because:
>
> src/include/utils/guc_tables.inc.c:1617:37: error: initializer element
> is not constant
> 1617 | .variable = &enable_seqscan,
>
> Heikki had developed a workaround for this in his branch[0]: For each
> GUC parameter, create a simple function that returns the address of the
> variable, and the config_generic record stores the address of the
> function. So like this:
>
> static bool *enable_seqscan_address(void) { return &enable_seqscan; }
>
> {
> .name = "enable_seqscan",
> .context = PGC_USERSET,
> .group = QUERY_TUNING_METHOD,
> .short_desc = gettext_noop("Enables the planner's use of sequential-scan plans."),
> .flags = GUC_EXPLAIN,
> .vartype = PGC_BOOL,
> ._bool = {
> .var_addr_func = enable_seqscan_address, // HERE
> .boot_val = true,
> },
> },
>
> and then the code in guc.c that reads and sets the values is adjusted
> like this in several places:
>
> - *conf->variable = conf->reset_val;
> + *conf->var_addr_func() = conf->reset_val;
>
> This works.
>
> [0]: see https://wiki.postgresql.org/wiki/Multithreading
>
> Heikki's branch contains some macros to generate those helper
> functions:
>
> #define DEFINE_BOOL_GUC_ADDR(guc) \
> static bool *guc##_address(void) { return &guc; }
>
> DEFINE_BOOL_GUC_ADDR(enable_seqscan)
>
> Note that this requires fixing up every GUC variable definition like
> this.
>
> With the generated guc_tables.inc.c, we could now generate these
> helper functions automatically. But you'd still need to modify each
> variable definition to add the thread_local specification.
>
> Actually, in Heikki's branch this is hidden behind macros and looks
> like this:
>
> session_guc bool enable_seqscan = true;
>
> And then there is additional tooling to check the annotations of all
> global variables, GUC or not, like this.
>
> So with that approach, we could add these kinds of annotations first,
> independent of thread support, and then later on add thread support
> without any further global code changes.
>
> This, however, doesn't work for user-defined GUC parameters in
> extensions.
>
> The interface for that looks like this:
>
> DefineCustomBoolVariable("auto_explain.log_analyze",
> "Use EXPLAIN ANALYZE for plan logging.",
> NULL,
> &auto_explain_log_analyze, // pointer to global var
> false,
> PGC_SUSET,
> 0,
> NULL,
> NULL,
> NULL);
>
> In Heikki's branch, the signatures of this and related functions are
> changed like this:
>
> extern void DefineCustomBoolVariable(const char *name,
> const char *short_desc,
> const char *long_desc,
> - bool *valueAddr,
> + GucBoolAddressHook addr_hook,
> bool bootValue,
> GucContext context,
> int flags,
>
> And then there are macros like shown earlier and some other ones to
> define the required helper functions and hook this all together.
>
> As written, this would break source-code compatibility for all
> extensions that use these functions. We could conceivably create
> alternative functions like DefineCustomBoolVariableExt() and make the
> old interfaces wrappers around the new ones, or something like that.
> But of course, we would ideally want extensions to adopt the new
> system, whatever it might be, before long.
>
> The point is, while we could probably do this transition with
> relatively little impact on the core code and built-in GUC parameters,
> it appears that extension code will require nontrivial manual work to
> adopt this and also maintain backward compatibility. So we need to
> think this through before shipping those interfaces.

Agreed, we'll need to carefully consider this.

> Now consider furthermore that in some future we might want to decouple
> sessions from threads. There is a lot of work to be done between here
> and there, but it seems a quite plausible idea. At that point, we
> would need to get rid of the thread-local global variables anyway. So
> should we do that now already? If we're going to force extension
> authors to amend their code for this, can we do it so that they only
> have to do it once? It would be kind of annoying if one had to
> support like three different custom-GUC interfaces in an extension
> that wants to support five PostgreSQL major versions.
>
> What might take the place of the global variables then? Note that it
> cannot just be a struct with fields for all the parameters, because
> that's not extensible. So it would need to be some kind of dynamic
> key-value structure, like a hash table. And we already have that!
> All the GUC records are already in a hash table
>
> static HTAB *guc_hashtab;
>
> which is used for all the utility commands and system views and so on.
>
> Could we use that for getting the current values at run time, too?
>
> So instead of
>
> void
> cost_seqscan(...)
> {
> ...
> path->disabled_nodes = enable_seqscan ? 0 : 1;
> ...
> }
>
> do something like
>
> void
> cost_seqscan(...)
> {
> ...
> path->disabled_nodes = get_config_val_bool("enable_seqscan") ?
> 0 : 1;
> ...
> }
>
> where get_config_val_*() would be a thin wrapper around hash_search()
> (a bit like the existing GetConfigOption() and find_option(), but
> without all the error checking).

I think this would be too expensive for extensions' GUCs, let alone
our own GUCs.

> Would that be too expensive? This would have to be checked in detail,
> of course, but just for this example I note that cost_seqscan() is not
> afraid to do multiple hash table lookups anyway (e.g.,
> get_tablespace_page_costs(), get_restriction_qual_cost()), so this
> would not be an order-of-magnitude change.
> There might also be other
> approaches, like caching some planner settings in PlannerInfo. Worst
> case, as a transition measure, we could add assign hooks that write to
> a global variable on a case-by-case basis.
>
> My question at this point is, which of these scenarios should we work
> toward? Either work toward thread-local variables and helper
> functions and provide new APIs for extensions. Or work toward getting
> rid of the global variables and use hash-table lookups whenever the
> value is needed, with some caching if necessary. (Or other ideas?)

I think we might generally benefit more from moving GUCs to a
group-at-a-time (well, struct-at-a-time) approach.

Right now, GUCs are individually addressed inside the GUC system, and
the API is built for that. It's worked well for us, but for
thread-local storage I don't think it'll scale: with n GUCs and t
threads we'd store O(n*t) pointers if pointers are cached in
thread-local state, and we would definitely have to call each of these
address-resolving functions during backend startup (when we set up the
GUC tables with user/database session settings, when values are
otherwise not pre-filled with their defaults, or when parallel workers
restore GUC state from their leader). That's just not great: the
number of resolvable symbols for GUCs would need to double.
I think we'd benefit from having each extension maintain a single GUC
struct [^1] whose contents (GUC names, types, offsets in struct, other
related checks) are registered with one call, together with a single
TLS reference resolver function (like the one used in Heikki's design).

This way, the GUC system only needs to keep track of a single TLS
pointer function per registered set of GUCs. That reduces the per-GUC
overhead in the GUC infrastructure by a few bytes (32 bits should be
sufficient for GUC struct-internal offsets; possibly even 16 bits), and
makes it easier to cache the TLS pointers (assuming that a thread's
TLS doesn't move). Extensions could link as usual; they'd just have to
address the struct's fields instead of global variables.
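
As a rough illustration of the struct-at-a-time idea, here is a minimal
sketch in C11. Everything named My* below, the AutoExplainGucs struct,
and the lookup helper are hypothetical and not an existing PostgreSQL
API; only the two auto_explain parameter names are real. It shows one
resolver function per struct plus a per-field offset table:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value types; real GUCs also have real, string and enum. */
typedef enum { MY_GUC_BOOL, MY_GUC_INT } MyGucType;

/* One entry per field: name, type, and offset within the group struct. */
typedef struct MyGucField
{
    const char *name;
    MyGucType   type;
    uint16_t    offset;         /* struct-internal offset, 16 bits here */
} MyGucField;

/* One record per registered struct: a single resolver covers all fields. */
typedef struct MyGucGroup
{
    const MyGucField *fields;
    size_t      nfields;
    size_t      size;           /* sizeof the group struct */
    void       *(*resolve) (void);  /* returns the calling thread's copy */
} MyGucGroup;

/* The extension's settings, addressed as struct fields, not globals. */
typedef struct AutoExplainGucs
{
    bool        log_analyze;
    int         log_min_duration;
} AutoExplainGucs;

static _Thread_local AutoExplainGucs auto_explain_gucs = {false, -1};

/* The only TLS address function needed for the whole group. */
static void *
auto_explain_resolve(void)
{
    return &auto_explain_gucs;
}

static const MyGucField auto_explain_fields[] = {
    {"auto_explain.log_analyze", MY_GUC_BOOL,
     offsetof(AutoExplainGucs, log_analyze)},
    {"auto_explain.log_min_duration", MY_GUC_INT,
     offsetof(AutoExplainGucs, log_min_duration)},
};

static const MyGucGroup auto_explain_group = {
    auto_explain_fields, 2, sizeof(AutoExplainGucs), auto_explain_resolve
};

/*
 * What the GUC machinery might do to set a bool: one resolver call,
 * then plain offset arithmetic. Returns false if no bool field matches.
 */
static bool
my_guc_set_bool(const MyGucGroup *group, const char *name, bool value)
{
    char       *base = group->resolve();

    for (size_t i = 0; i < group->nfields; i++)
    {
        const MyGucField *f = &group->fields[i];

        if (f->type == MY_GUC_BOOL && strcmp(f->name, name) == 0)
        {
            *(bool *) (base + f->offset) = value;
            return true;
        }
    }
    return false;
}
```

With a layout like this, extension code would read
auto_explain_gucs.log_analyze directly, while the GUC machinery needs
one resolver call per group and could cache the returned base pointer
per thread.
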
This would also have the benefit of making GUC state restoration
faster through the use of memcpy/memmove on whole structs, rather than
the current value-at-a-time approach. I believe that even in the
current multiprocessing model we could benefit from this approach.
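
The whole-struct restore could look roughly like this. PlannerGucs and
the save/restore helpers are made up for illustration; the sketch also
assumes every field is stored by value, since string GUCs hold pointers
and would need separate handling:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical group of planner settings kept in one struct. */
typedef struct PlannerGucs
{
    bool        enable_seqscan;
    bool        enable_indexscan;
    double      seq_page_cost;
} PlannerGucs;

static _Thread_local PlannerGucs planner_gucs = {true, true, 1.0};

/* Snapshot the whole group with a single copy. */
static void
planner_gucs_save(PlannerGucs *saved)
{
    memcpy(saved, &planner_gucs, sizeof(PlannerGucs));
}

/* Put every field back at once instead of value-at-a-time. */
static void
planner_gucs_restore(const PlannerGucs *saved)
{
    memcpy(&planner_gucs, saved, sizeof(PlannerGucs));
}
```

A caller would take a snapshot with planner_gucs_save() before applying
temporary settings and hand the same snapshot to planner_gucs_restore()
afterwards.
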
Note: This would look similar to the "fields in a Session struct"
approach from a GUC user's perspective, but it is more generalized so
that extensions get a performance profile similar to native
PostgreSQL.

Maybe we could add headers to generate these structs from lists of
GUCs like how rmgrlist.h works, or possibly have scripts to generate
GUC tables from struct metadata like what gen_node_support.pl does for
Node infrastructure? I suspect it would help make the transition
easier.
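
A list-driven header in the spirit of rmgrlist.h might be sketched as
below. The my_ext.* parameter names, the MY_EXT_GUC_LIST macro, and the
metadata layout are all hypothetical; rmgrlist.h itself has the
including file define the per-entry macro, whereas this sketch passes
it as an argument:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Single source of truth: C type, field name, GUC name, boot value. */
#define MY_EXT_GUC_LIST(X) \
    X(bool, log_analyze, "my_ext.log_analyze", false) \
    X(int, log_min_duration, "my_ext.log_min_duration", -1)

/* Expansion 1: the struct that extension code reads from. */
#define AS_FIELD(ctype, field, gucname, boot) ctype field;
typedef struct MyExtGucs
{
    MY_EXT_GUC_LIST(AS_FIELD)
} MyExtGucs;
#undef AS_FIELD

/* Expansion 2: metadata the GUC system would need for registration. */
typedef struct MyExtGucMeta
{
    const char *name;
    size_t      offset;
    size_t      size;
} MyExtGucMeta;

#define AS_META(ctype, field, gucname, boot) \
    {gucname, offsetof(MyExtGucs, field), sizeof(ctype)},
static const MyExtGucMeta my_ext_guc_meta[] = {
    MY_EXT_GUC_LIST(AS_META)
};
#undef AS_META

/* Expansion 3: boot values, in the same struct shape. */
#define AS_BOOT(ctype, field, gucname, boot) .field = boot,
static const MyExtGucs my_ext_guc_boot = {
    MY_EXT_GUC_LIST(AS_BOOT)
};
#undef AS_BOOT
```

Adding a parameter then means adding one line to the list; the struct,
the metadata table, and the boot values stay in sync automatically.
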
Kind regards,

Matthias van de Meent

[^1] For memory efficiency, one struct per modification level would
probably be better; PGC_POSTMASTER wouldn't change as much as
PGC_SIGHUP or PGC_BACKEND, and therefore maybe shouldn't be co-located.
Notably, PG itself would only need about 1.9kB to store all GUCs.