straightening out backend process startup - Mailing list pgsql-hackers

From Andres Freund
Subject straightening out backend process startup
Date
Msg-id 20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
Whole thread Raw
Responses Re: straightening out backend process startup
Re: straightening out backend process startup
Re: straightening out backend process startup
List pgsql-hackers
Hi,

I've previously complained ([1]) that process initialization has gotten
very complicated. I hit this once more last week when trying to commit
one of the shared memory stats pieces...

I think there's quite a few different issues around this - here I'm just
trying to tackle a few of the most glaring (to me):

- AuxiliaryProcessMain() is used for two independent tasks: Start bootstrap /
  checker mode and starting auxiliary processes. In HEAD there's maybe 5 lines
  out 250 that are actually common to both uses.

  A related oddity is that we reserve shared memory resources for bootstrap &
  checker aux processes, despite those never existing.

  This is addressed in patches 1-7

- The order of invocation of InitProcess()/InitAuxiliaryProcess() and
  BaseInit() depends on EXEC_BACKEND. Due to that there often is no single
  place initialization code can be put if it needs any locks.

  This is addressed in patches 8-9

- PostgresMain() has code for single user and multi user interleaved, making
  it unnecessarily hard to understand what's going on.

  This is addressed in patches 10


This isn't a patch series ready to commit, there's a bunch of polishing that
needs to be done if there's agreement.


Questions I have:

- What exactly to do with checker mode: Keep it as part of bootstrap, separate
  it out completely? What commandline flags?

- I used a separate argv entry to pass the aux proc type - do we rather want
  to go for the approach that e.g. bgworker went for? Adding code for string
  splitting seems a bit unnecessary to me.

- PostgresMainSingle() should probably not be in postgres.c. We could put it
  into postinit.c or ..?

- My first attempt at PostgresMainSingle() separated the single/multi user
  cases a bit more than the code right now, by having a PostgresMainCommon()
  which was called by PostgresMainSingle(), PostgresMain(). *Common only
  started with the MessageContext allocation, which did have the advantage of
  splitting out a few of the remaining conditional actions in PostgresMain()
  (PostmasterContext, welcome banner, Log_disconnections). But lead to a bit
  more duplication. I don't really have an opinion on what's better.

- I had to move the PgStartTime computation to a bit earlier for single user
  mode. That seems to make sense to me anyway, given that postmaster does so
  fairly early too.

  Any reason that'd be a bad idea?

  Arguably it should even be a tad earlier to be symmetric.


There's one further issue that I think is big enough to be worth
tackling in the near future: Too many things depend on BackendIds.

Aux processes need procsignal and backend status reporting, which use
BackendId for indexing. But they don't use sinval, so they don't have a
BackendId - so we have hacks to work around that in a few places. If we
instead make those places use pgprocno for indexing the whole issue
vanishes.

In fact, I think there's a good argument to be made that we should
entirely remove the concept of BackendIds and just use pgprocnos. We
have a fair number of places like SignalVirtualTransaction() that need
to search the procarray just to find the proc to signal based on the
BackendId. If we used pgprocno instead, that'd not be needed.

But perhaps that's a separate thread.

Greetings,

Andres Freund

[1] https://postgr.es/m/20210402002240.56cuz3uo3alnqwae%40alap3.anarazel.de

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Minimal logical decoding on standbys
Next
From: Chris Cleveland
Date:
Subject: Passing data out of indexam or tableam