Re: VM corruption on standby - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: VM corruption on standby
Date	August 19, 2025 21:08:19
Msg-id	599759.1755626899@sss.pgh.pa.us
In response to	Re: VM corruption on standby (Kirill Reshke <reshkekirill@gmail.com>)
Responses	Re: VM corruption on standby
List	pgsql-hackers

Kirill Reshke <reshkekirill@gmail.com> writes:
> On Tue, 19 Aug 2025 at 21:16, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>> `if (CritSectionCount != 0) _exit(2) else proc_exit(1)` in
>> WaitEventSetWaitBlock () solves the issue of inconsistency IF POSTMASTER IS
>> SIGKILLED, and doesn't lead to any problem, if postmaster is not SIGKILL-ed
>> (since postmaster will SIGKILL its children).

> This fix was proposed in this thread. It fixes inconsistency but it
> replaces one set of problems with another set, namely systems that
> fail to shut down.

I think a bigger objection is that it'd result in two separate
shutdown behaviors in what's already an extremely under-tested
(and hard to test) scenario.  I don't want to have to deal with
the ensuing state-space explosion.

I still think that proc_exit(1) is fundamentally the wrong thing
to do if the postmaster is gone: that code path assumes that
the cluster is still functional, which is at best shaky.
I concur though that we'd have to do some more engineering work
before _exit(2) would be a practical solution.
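
To make the two options concrete, here is a minimal sketch of the decision
being debated.  It is not PostgreSQL source: ExitAction,
conditional_exit_action() and uniform_exit_action() are hypothetical names
invented for illustration, and the enum values merely stand in for calling
proc_exit(1) or _exit(2) once postmaster death has been detected.

```c
#include <assert.h>

/*
 * Hypothetical illustration only; none of these names exist in PostgreSQL.
 * The enum stands in for the real calls so the choice can be inspected
 * without actually terminating the process.
 */
typedef enum
{
    EXIT_VIA_PROC_EXIT,     /* orderly path, stands in for proc_exit(1) */
    EXIT_IMMEDIATELY        /* no-cleanup path, stands in for _exit(2) */
} ExitAction;

/*
 * The quoted proposal: choose the exit path depending on whether the
 * process is inside a critical section when the postmaster is found dead.
 */
ExitAction
conditional_exit_action(int crit_section_count)
{
    assert(crit_section_count >= 0);
    return (crit_section_count != 0) ? EXIT_IMMEDIATELY : EXIT_VIA_PROC_EXIT;
}

/*
 * A single-behavior alternative: always take the no-cleanup path,
 * regardless of critical-section state.
 */
ExitAction
uniform_exit_action(int crit_section_count)
{
    (void) crit_section_count;  /* deliberately ignored */
    return EXIT_IMMEDIATELY;
}
```

The conditional variant has two outcomes that both need testing, which is
the state-space concern; the uniform variant has one outcome, but it is only
practical after the additional engineering work mentioned above.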

In the meantime, it seems like this discussion point arises
only because the presented test case is doing something that
seems pretty unsafe, namely invoking WaitEventSet inside a
critical section.

We'd probably be best off getting back to the actual bug the
thread started with, namely whether we're doing the wrong
thing with the VM-update order of operations.

            regards, tom lane
