Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o - Mailing list pgsql-committers

From Andres Freund
Subject Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o
Date
Msg-id 20210807164850.52atvo2qkmtfnzgt@alap3.anarazel.de
Whole thread Raw
In response to Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o  (Andres Freund <andres@anarazel.de>)
List pgsql-committers
Hi,

On 2021-08-07 11:43:07 -0400, Tom Lane wrote:
> So if I have the lay of the land correctly:
> 
> 1. Somebody decided it'd be a great idea for temp file cleanup to send
> stats collector messages.
> 
> 2. Temp file cleanup happens after shmem disconnection.
> 
> 3. This accidentally worked, up to now, because stats transmission happens
> via a separate socket not shared memory.
> 
> 4. We can't keep both #1 and #2 if we'd like to switch to shmem-based
> stats collection.

Sounds accurate to me.


> Intuitively it seems like temp file management should be a low-level,
> backend-local function and therefore should be okay to run after
> shmem disconnection.  I do not have a warm feeling about reversing that
> module layering --- what's to stop someone from breaking things by
> trying to use a temp file in their on_proc_exit or on_shmem_exit hook?

We could just add an assert preventing that from happening. It's hard to
believe that there could be a good reason to use temp files in those hook
points.

I'm somewhat inclined to split InitFileAccess() into two by separating out
InitTemporaryFileAccess() or such. InitFileAccess() would continue to happen
early and register a proc exit hook that errors out when there's a temp file
(as a backstop for non cassert builds). The new InitTemporaryFileAccess()
would happen a bit later and schedule CleanupTempFiles() to happen via
before_shmem_access(). And we add a Assert(!proc_exit_inprogress) to the
routines for opening a temp file.


> Maybe what needs to go overboard is point 1.

Keeping stats of temp files seems useful enough that I'm a bit hesitant to go
there. I guess we could just prevent pgstats_report_tempfile() from being
called when CleanupTempFiles() is called during proc exit, but that doesn't
seem great.


> More generally, this points up the fact that we don't have a well-defined
> module hierarchy that would help us understand what code can safely do which.
> I'm not volunteering to design that, but maybe it needs to happen soon.

I agree. Part of the reason for whacking around process startup (in both
pushed and still pending commits) was that previously it wasn't just poorly
defined, it differed significantly between platforms.  And I'm quite unhappy
with the vagueness in which we defined the meaning of the various shutdown
callbacks ([1]).

I suspect to even get to the point of doing a useful redesign of the module
hierarchy, we'd need to unify more of the process initialization between
EXEC_BACKEND and normal builds.

I've bitten by all this often enough to be motivated to propose
something. However I want to get the basics of the shared memory stats stuff
in first - it's a pain to keep it upated, and we'll need to find and solve all
of the issues it has anyway, even if we go for a redesign of module / startup
/ shutdown layering.

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/20210803023612.iziacxk5syn2r4ut%40alap3.anarazel.de



pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o
Next
From: Tom Lane
Date:
Subject: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o