Re: Modernizing our GUC infrastructure - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Modernizing our GUC infrastructure |
Date | |
Msg-id | 20220905233233.jhcu5jqsrtosmgh5@awork3.anarazel.de Whole thread Raw |
In response to | Modernizing our GUC infrastructure (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Modernizing our GUC infrastructure
|
List | pgsql-hackers |
Hi, > I wrote this because I am starting to question the schema-variables patch > [1] --- that's getting to be quite a large patch and I grow less and less > sure that it's solving a problem our users want solved. --- that's getting > to be quite a large patch and I grow less and less sure that it's solving a > problem our users want solved. I think what people actually want is better > support of the existing mechanism for ad-hoc session variables, namely > abusing custom GUCs for that purpose. I don't really have an opinion on the highlevel directional question, yet anyway. But the stuff you're talking about changing in guc.c seem like a good idea independently. On 2022-09-05 18:27:46 -0400, Tom Lane wrote: > 0001 and 0002 below are concerned with converting guc.c to store its > data in a dedicated memory context (GUCMemoryContext) instead of using > raw malloc(). This is not directly a performance improvement, and > I'm prepared to drop the idea if there's a lot of pushback, but I > think it would be a good thing to do. +1 - I've been annoyed at this a couple times, even just because it makes it harder to identify memory leaks etc. > The only hard reason for using > malloc() there was the lack of ability to avoid throwing elog(ERROR) > on out-of-memory in palloc(). But mcxt.c grew that ability years ago. > Switching to a dedicated context would greatly improve visibility and > accountability of GUC-related allocations. Also, the 0003 patch will > switch guc.c to relying on a palloc-based hashtable, and it seems a > bit silly to have part of the data structure in palloc and part in > malloc. However 0002 is a bit invasive, in that it forces code > changes in GUC check callbacks, if they want to reallocate the new > value or create an "extra" data structure. My feeling is that not > enough external modules use those facilities for this to pose a big > problem. However, the ones that are subject to it will have a > non-fun time tracking down why their code is crashing. That sucks, but I think it's a bullet we're going to have to bite at some point. Perhaps we could do something like checking MemoryContextContains() and assert if not allocated in the right context? That way the crash is at least obvious. Or perhaps even transparently reallocate in that case? It does look like MemoryContextContains() currently is broken, I've raised that in the other thread. > (The recent context-header changes mean that you get a very obscure failure > when trying to pfree() a malloc'd chunk -- for me, that typically ends in an > assertion failure in generation.c. Can we make that less confusing?) Hm. We can do better in assert builds, but I'm not sure we want to add the overhead of explicit checks in normal builds, IIRC David measured the overhead of additional branches in pfree, and it was noticable. > 0003 replaces guc.c's bsearch-a-sorted-array lookup infrastructure > with a dynahash hash table. (I also looked at simplehash, but it > has no support for not elog'ing on OOM, and it increases the size > of guc.o by 10KB or so.) Dynahash seems reasonable here. Hard to believe raw lookup speed is a relevant bottleneck and due to the string names the key would be pretty wide (could obviously just be done via pointer, but then the locality benefits aren't as big). > However, the places that were linearly scanning the array now need to use > hash_seq_search, so some other things like transaction shutdown > (AtEOXact_GUC) get slower. > > To address that, 0004 adds some auxiliary lists that link together > just the variables that are interesting for specific purposes. Seems sane. It's only half related, but since we're talking about renovating guc.c: I think it'd be good if we split the list of GUCs from the rest of the guc machinery. Both for humans and compilers it's getting pretty large. And commonly one either wants to edit the definition of GUCs or wants to edit the GUC machinery. Greetings, Andres Freund
pgsql-hackers by date: