Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o - Mailing list pgsql-committers

From Tom Lane
Subject Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o
Date
Msg-id 2771326.1628356007@sss.pgh.pa.us
Whole thread Raw
In response to Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o  (Andres Freund <andres@anarazel.de>)
Responses Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o  (Andres Freund <andres@anarazel.de>)
List pgsql-committers
Andres Freund <andres@anarazel.de> writes:
> On 2021-08-07 11:43:07 -0400, Tom Lane wrote:
>> Intuitively it seems like temp file management should be a low-level,
>> backend-local function and therefore should be okay to run after
>> shmem disconnection.  I do not have a warm feeling about reversing that
>> module layering --- what's to stop someone from breaking things by
>> trying to use a temp file in their on_proc_exit or on_shmem_exit hook?

> We could just add an assert preventing that from happening. It's hard to
> believe that there could be a good reason to use temp files in those hook
> points.

The bigger picture here is that anyplace anybody ever wants to add stats
collection in suddenly becomes a "must run before shmem disconnection"
module.  I think that way madness lies --- we can't have the entire
backend shut down before shmem disconnection.  So I feel like there's
a serious design problem here, and it's not confined to temp files,
or at least it won't stay confined to temp files.  (There may indeed
be other problems already, that we just haven't had the good luck to
have buildfarm timing vagaries expose to us.)

Maybe the solution is to acknowledge that we might lose some events
during backend shutdown, and redefine the behavior as "we ignore
event reports after pgstat shutdown", not "we assert that there never
can be any such reports".

> I'm somewhat inclined to split InitFileAccess() into two by separating out
> InitTemporaryFileAccess() or such. InitFileAccess() would continue to happen
> early and register a proc exit hook that errors out when there's a temp file
> (as a backstop for non cassert builds). The new InitTemporaryFileAccess()
> would happen a bit later and schedule CleanupTempFiles() to happen via
> before_shmem_access(). And we add a Assert(!proc_exit_inprogress) to the
> routines for opening a temp file.

Maybe that would work, but after you multiply it by a bunch of different
scenarios in different modules, it's going to get less and less attractive.

> I've bitten by all this often enough to be motivated to propose
> something. However I want to get the basics of the shared memory stats stuff
> in first - it's a pain to keep it upated, and we'll need to find and solve all
> of the issues it has anyway, even if we go for a redesign of module / startup
> / shutdown layering.

Fair.  But I suggest that the first cut should look more like what
I suggest above, ie just be willing to lose events during shutdown.
The downsides of that are not so enormous that we should be willing
to undertake major klugery to avoid it before we've even got a
semi-working system.

            regards, tom lane



pgsql-committers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o
Next
From: Andres Freund
Date:
Subject: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o