On Tue, Jan 23, 2024 at 06:33:25PM +0100, Alvaro Herrera wrote:
> On 2024-Jan-22, Nathan Bossart wrote:
>> This might be a topic for another thread, but I do wonder whether we could
>> put a generic pg_controldata check in node->stop that would at least make
>> sure that the state is some sort of expected shut-down state. My first
>> thought is that it could be a tad expensive, but... maybe it wouldn't be
>> too bad.
>
> Does this actually detect a problem if you take out the fix? I think
> what's going to happen is that postmaster is going to crash, then do the
> recovery cycle, then stop as instructed by the test; so pg_controldata
> would report that it was correctly shut down.

Yes, the control data shows "in production" without the fix. The segfault
happens within the shut-down path, and the test logs indicate that the
server continues shutting down without doing a recovery cycle:

2024-01-23 12:14:49.254 CST [2376301] LOG: received fast shutdown request
2024-01-23 12:14:49.254 CST [2376301] LOG: aborting any active transactions
2024-01-23 12:14:49.255 CST [2376301] LOG: background worker "logical replication launcher" (PID 2376308) exited with exit code 1
2024-01-23 12:14:49.256 CST [2376301] LOG: background worker "autoprewarm leader" (PID 2376307) was terminated by signal 11: Segmentation fault
2024-01-23 12:14:49.256 CST [2376301] LOG: terminating any other active server processes
2024-01-23 12:14:49.257 CST [2376301] LOG: abnormal database system shutdown
2024-01-23 12:14:49.261 CST [2376301] LOG: database system is shut down
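
For illustration, here is a minimal sketch of what a generic post-stop check
could look like. It is written in Python only to show the logic (the TAP
framework itself is Perl), and parse_cluster_state, stopped_cleanly and
CLEAN_SHUTDOWN_STATES are hypothetical names, not existing test APIs. It
assumes pg_controldata is on PATH and that its output is not translated,
which is why LC_ALL=C is forced.

```python
import os
import subprocess

# Cluster states that pg_controldata reports after a clean stop
# (the second one applies to a standby).
CLEAN_SHUTDOWN_STATES = {"shut down", "shut down in recovery"}


def parse_cluster_state(controldata_output: str) -> str:
    """Return the value of the 'Database cluster state' line."""
    for line in controldata_output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() == "Database cluster state":
            return value.strip()
    raise ValueError("no 'Database cluster state' line found")


def stopped_cleanly(datadir: str) -> bool:
    """Run pg_controldata on datadir and check for a clean shut-down state."""
    env = dict(os.environ, LC_ALL="C")  # keep the field labels untranslated
    result = subprocess.run(
        ["pg_controldata", "-D", datadir],
        capture_output=True, text=True, check=True, env=env,
    )
    return parse_cluster_state(result.stdout) in CLEAN_SHUTDOWN_STATES
```

With the behavior described above, such a check would see "in production"
after the stop and fail, since that state is not in the clean set.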
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com