Thread: Process initialization labyrinth

Process initialization labyrinth

From

Andres Freund

Date:

02 April 2021, 03:22:40

Hi,

While working on the shared memory stats patch I (not for the first
time), issues with our process initialization.

The concrete issue was that I noticed that some stats early in startup
weren't processed correctly - the stats system wasn't initialized yet. I
consequently added assertions ensuring that we don't try to report stats
before that. Which blew up.

Even in master we report stats well before the pgstat_initialize()
call. E.g. in autovac workers:
        /*
         * Report autovac startup to the stats collector.  We deliberately do
         * this before InitPostgres, so that the last_autovac_time will get
         * updated even if the connection attempt fails.  This is to prevent
         * autovac from getting "stuck" repeatedly selecting an unopenable
         * database, rather than making any progress on stuff it can connect
         * to.
         */

That previously just didn't cause a problem, because we didn't really
need pgstat_initialize() to have happened for stats reporting to work.

In the shared memory stats patch there's no dependency on
pgstat_initialize() knowing MyBackendId anymore (broken out to a
separate function). So I tried moving the stats initialization to
somewhere earlier.


There currently is simply no way of doing that that doesn't cause
duplication, or weird conditions. We can't do it in:

- InitProcess()/InitAuxiliaryProcess(),
  CreateSharedMemoryAndSemaphores() hasn't yet run in EXEC_BACKEND
- below CreateSharedMemoryAndSemaphores(), as that isn't called for each
  backend in !EXEC_BACKEND
- InitPostgres(), because autovac workers report stats before that
- BaseInit(), because it's called before we have a PROC iff !EXEC_BACKEND
- ...

I have now worked around this by generous application of ugly, but I
think we really need to do something about this mazy mess. There's just
an insane amount of duplication, and it's too complicated to remember
more than a few minutes.

I really would like to not see things like

    /*
     * Create a per-backend PGPROC struct in shared memory, except in the
     * EXEC_BACKEND case where this was done in SubPostmasterMain. We must do
     * this before we can use LWLocks (and in the EXEC_BACKEND case we already
     * had to do some stuff with LWLocks).
     */
#ifdef EXEC_BACKEND
    if (!IsUnderPostmaster)
        InitProcess();
#else
    InitProcess();
#endif

Similarly, codeflow like bootstrap.c being involved in bog standard
stuff like starting up wal writer etc, is just pointlessly
confusing. Note that bootstrap itself does *not* go through
AuxiliaryProcessMain(), and thus has yet another set of initialization
needs.


Greetings,

Andres Freund

Re: Process initialization labyrinth

From

Mike Palmiotto

Date:

02 April 2021, 17:18:20

On Thu, Apr 1, 2021 at 8:22 PM Andres Freund <andres@anarazel.de> wrote:
> <snip>
> I have now worked around this by generous application of ugly, but I
> think we really need to do something about this mazy mess. There's just
> an insane amount of duplication, and it's too complicated to remember
> more than a few minutes.
>
> I really would like to not see things like
>
>         /*
>          * Create a per-backend PGPROC struct in shared memory, except in the
>          * EXEC_BACKEND case where this was done in SubPostmasterMain. We must do
>          * this before we can use LWLocks (and in the EXEC_BACKEND case we already
>          * had to do some stuff with LWLocks).
>          */
> #ifdef EXEC_BACKEND
>         if (!IsUnderPostmaster)
>                 InitProcess();
> #else
>         InitProcess();
> #endif
>
> Similarly, codeflow like bootstrap.c being involved in bog standard
> stuff like starting up wal writer etc, is just pointlessly
> confusing. Note that bootstrap itself does *not* go through
> AuxiliaryProcessMain(), and thus has yet another set of initialization
> needs.

I can't really speak to the initial points directly relating to
pgstat/shmem, but for the overall maze-like nature of the startup
code: is there any chance the startup centralization patchset would be
of any help here?
https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com

I know you are at least vaguely aware of the efforts there, as you
reviewed the patchset. Figured I should at least bring it up in case
it seemed helpful, or an effort you'd like to re-invigorate.

Thanks,

--
Mike Palmiotto
https://crunchydata.com

Re: Process initialization labyrinth

From

Andres Freund

Date:

02 April 2021, 20:02:47

Hi,

On 2021-04-02 10:18:20 -0400, Mike Palmiotto wrote:
> I can't really speak to the initial points directly relating to
> pgstat/shmem, but for the overall maze-like nature of the startup
> code: is there any chance the startup centralization patchset would be
> of any help here?
> https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com

I think parts of it could help, at least. It doesn't really do much
about centralizing / de-mazing the actual initialization of sub-systems,
but it'd make it a bit easier to do so.

Greetings,

Andres Freund

Re: Process initialization labyrinth

From

Mike Palmiotto

Date:

02 April 2021, 20:31:10

On Fri, Apr 2, 2021 at 1:02 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-04-02 10:18:20 -0400, Mike Palmiotto wrote:
> > I can't really speak to the initial points directly relating to
> > pgstat/shmem, but for the overall maze-like nature of the startup
> > code: is there any chance the startup centralization patchset would be
> > of any help here?
> > https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com
>
> I think parts of it could help, at least. It doesn't really do much
> about centralizing / de-mazing the actual initialization of sub-systems,
> but it'd make it a bit easier to do so.

The patchset needs some work, no doubt. If you think it'd be useful, I
can spend some of my free time addressing any gaps in the design. I'd
hate to see that code go to waste, as I think it may have been a
reasonable first step.

Also not opposed to you taking the patchset and running with it if you prefer.

Thanks,
-- 
Mike Palmiotto
https://crunchydata.com