Thread: Process initialization labyrinth
Hi, While working on the shared memory stats patch I (not for the first time), issues with our process initialization. The concrete issue was that I noticed that some stats early in startup weren't processed correctly - the stats system wasn't initialized yet. I consequently added assertions ensuring that we don't try to report stats before that. Which blew up. Even in master we report stats well before the pgstat_initialize() call. E.g. in autovac workers: /* * Report autovac startup to the stats collector. We deliberately do * this before InitPostgres, so that the last_autovac_time will get * updated even if the connection attempt fails. This is to prevent * autovac from getting "stuck" repeatedly selecting an unopenable * database, rather than making any progress on stuff it can connect * to. */ That previously just didn't cause a problem, because we didn't really need pgstat_initialize() to have happened for stats reporting to work. In the shared memory stats patch there's no dependency on pgstat_initialize() knowing MyBackendId anymore (broken out to a separate function). So I tried moving the stats initialization to somewhere earlier. There currently is simply no way of doing that that doesn't cause duplication, or weird conditions. We can't do it in: - InitProcess()/InitAuxiliaryProcess(), CreateSharedMemoryAndSemaphores() hasn't yet run in EXEC_BACKEND - below CreateSharedMemoryAndSemaphores(), as that isn't called for each backend in !EXEC_BACKEND - InitPostgres(), because autovac workers report stats before that - BaseInit(), because it's called before we have a PROC iff !EXEC_BACKEND - ... I have now worked around this by generous application of ugly, but I think we really need to do something about this mazy mess. There's just an insane amount of duplication, and it's too complicated to remember more than a few minutes. I really would like to not see things like /* * Create a per-backend PGPROC struct in shared memory, except in the * EXEC_BACKEND case where this was done in SubPostmasterMain. We must do * this before we can use LWLocks (and in the EXEC_BACKEND case we already * had to do some stuff with LWLocks). */ #ifdef EXEC_BACKEND if (!IsUnderPostmaster) InitProcess(); #else InitProcess(); #endif Similarly, codeflow like bootstrap.c being involved in bog standard stuff like starting up wal writer etc, is just pointlessly confusing. Note that bootstrap itself does *not* go through AuxiliaryProcessMain(), and thus has yet another set of initialization needs. Greetings, Andres Freund
On Thu, Apr 1, 2021 at 8:22 PM Andres Freund <andres@anarazel.de> wrote: > <snip> > I have now worked around this by generous application of ugly, but I > think we really need to do something about this mazy mess. There's just > an insane amount of duplication, and it's too complicated to remember > more than a few minutes. > > I really would like to not see things like > > /* > * Create a per-backend PGPROC struct in shared memory, except in the > * EXEC_BACKEND case where this was done in SubPostmasterMain. We must do > * this before we can use LWLocks (and in the EXEC_BACKEND case we already > * had to do some stuff with LWLocks). > */ > #ifdef EXEC_BACKEND > if (!IsUnderPostmaster) > InitProcess(); > #else > InitProcess(); > #endif > > Similarly, codeflow like bootstrap.c being involved in bog standard > stuff like starting up wal writer etc, is just pointlessly > confusing. Note that bootstrap itself does *not* go through > AuxiliaryProcessMain(), and thus has yet another set of initialization > needs. I can't really speak to the initial points directly relating to pgstat/shmem, but for the overall maze-like nature of the startup code: is there any chance the startup centralization patchset would be of any help here? https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com I know you are at least vaguely aware of the efforts there, as you reviewed the patchset. Figured I should at least bring it up in case it seemed helpful, or an effort you'd like to re-invigorate. Thanks, -- Mike Palmiotto https://crunchydata.com
Hi, On 2021-04-02 10:18:20 -0400, Mike Palmiotto wrote: > I can't really speak to the initial points directly relating to > pgstat/shmem, but for the overall maze-like nature of the startup > code: is there any chance the startup centralization patchset would be > of any help here? > https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com I think parts of it could help, at least. It doesn't really do much about centralizing / de-mazing the actual initialization of sub-systems, but it'd make it a bit easier to do so. Greetings, Andres Freund
On Fri, Apr 2, 2021 at 1:02 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2021-04-02 10:18:20 -0400, Mike Palmiotto wrote: > > I can't really speak to the initial points directly relating to > > pgstat/shmem, but for the overall maze-like nature of the startup > > code: is there any chance the startup centralization patchset would be > > of any help here? > > https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com > > I think parts of it could help, at least. It doesn't really do much > about centralizing / de-mazing the actual initialization of sub-systems, > but it'd make it a bit easier to do so. The patchset needs some work, no doubt. If you think it'd be useful, I can spend some of my free time addressing any gaps in the design. I'd hate to see that code go to waste, as I think it may have been a reasonable first step. Also not opposed to you taking the patchset and running with it if you prefer. Thanks, -- Mike Palmiotto https://crunchydata.com