Re: Refactoring postmaster's code to cleanup after child exit - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Refactoring postmaster's code to cleanup after child exit
Date
Msg-id 8f2118b9-79e3-4af7-b2c9-bd5818193ca4@iki.fi
Whole thread Raw
In response to Re: Refactoring postmaster's code to cleanup after child exit  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Refactoring postmaster's code to cleanup after child exit
List pgsql-hackers
I committed the first two trivial patches, and have continued to work on 
postmaster.c, and how it manages all the child processes.

This is a lot of patches. They're built on top of each other, because 
that's the order I developed them in, but they probably could be applied 
in different order too. Please help me by reviewing these, before the 
stack grows even larger :-). Even partial reviews would be very helpful. 
I suggest to start reading them in order, and when you get tired, just 
send any comments you have up to that point.


* v3-0001-Make-BackgroundWorkerList-doubly-linked.patch

This is the same refactoring patch I started this thread with.

* v3-0003-Fix-comment-on-processes-being-kept-over-a-restar.patch
* v3-0004-Consolidate-postmaster-code-to-launch-background-.patch

Little refactoring of how postmaster launches the background processes.

* v3-0005-Add-test-for-connection-limits.patch
* v3-0006-Add-test-for-dead-end-backends.patch

A few new TAP tests for dead-end backends and enforcing connection 
limits. We didn't have much coverage for these before.

* v3-0007-Use-an-shmem_exit-callback-to-remove-backend-from.patch
* v3-0008-Introduce-a-separate-BackendType-for-dead-end-chi.patch

Some preliminary refactoring towards patch 
v3-0010-Assign-a-child-slot-to-every-postmaster-child-pro.patch

* v3-0009-Kill-dead-end-children-when-there-s-nothing-else-.patch

I noticed that we never send SIGTERM or SIGQUIT to dead-end backends, 
which seems silly. If the server is shutting down, dead-end backends 
might prevent the shutdown from completing. Dead-end backends will 
expire after authentication_timoeut (default 60s), so it won't last for 
too long, but still seems like we should kill dead-end backends if 
they're the only children preventing shutdown from completing.

* 3-0010-Assign-a-child-slot-to-every-postmaster-child-pro.patch

This is what I consider the main patch in this series. Currently, only 
regular backens, bgworkers and autovacuum workers have a PMChildFlags 
slot, which is used to detect when a postmaster child exits in an 
unclean way (in addition to the exit code). This patch assigns a child 
slot for all processes, except for dead-end backends. That includes all 
the aux processes.

While we're at it, I created separate pools of child slots for different 
kinds of backends, which fixes the issue that opening a lot of client 
connections can exhaust all the slots, so that background workers or 
autovacuum workers cannot start either [1].

[1] 
https://www.postgresql.org/message-id/55d2f50c-0b81-4b33-b202-cd2a406d69a3%40iki.fi

* v3-0011-Pass-MyPMChildSlot-as-an-explicit-argument-to-chi.patch

One more little refactoring, to pass MyPMChildSlot to the child process 
differently.


Where is all this leading? I'm not sure exactly, but having a postmaster 
child slot for every postmaster child seems highly useful. We could move 
the ProcSignal machinery to use those slot numbers for the indexes to 
the ProcSignal array, instead of ProcSignal, for example. That would 
allow all processes to participate in the signalling, even before they 
have a PGPROC entry. (Or with Thomas's interrupts refactoring, the 
interrupts array). With the multithreading work, PMChild struct could 
store a thread id, or whatever is needed for threads to communicate with 
each other. In any case, seems like it will come handy.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Casts from jsonb to other types should cope with json null
Next
From: Robert Haas
Date:
Subject: Re: can we mark upper/lower/textlike functions leakproof?