Re: VM corruption on standby - Mailing list pgsql-hackers

From Andres Freund
Subject Re: VM corruption on standby
Date
Msg-id s4lymuioguv4ir75jqqkl5taos2ft6aps2enjlwnphgb5loihq@llhtt5gkbjpb
In response to Re: VM corruption on standby  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: VM corruption on standby
List pgsql-hackers
Hi,

On 2025-08-19 02:13:43 -0400, Tom Lane wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On Tue, Aug 19, 2025 at 4:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> But I'm of the opinion that proc_exit
> >> is the wrong thing to use after seeing postmaster death, critical
> >> section or no.  We should assume that system integrity is already
> >> compromised, and get out as fast as we can with as few side-effects
> >> as possible.  It'll be up to the next generation of postmaster to
> >> try to clean up.
> 
> > Then wouldn't backends blocked in LWLockAcquire(x) hang forever, after
> > someone who holds x calls _exit()?
> 
> If someone who holds x is killed by (say) the OOM killer, how do
> we get out of that?

On Linux - the primary OS with OOM killer troubles - I'm pretty sure lwlock
waiters would get killed due to the postmaster death signal we've configured
(cf. PostmasterDeathSignalInit()).

A long while back I had experimented with replacing waiting on semaphores
(within lwlocks) with a latch wait. IIRC it was a bit slower under heavy
contention, but that vanished when adding some adaptive spinning to lwlocks -
which is also what we need to make it more feasible to replace some of the
remaining spinlocks...
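
To illustrate what "adaptive spinning" means in general, here is a minimal sketch in C11. It is not the lwlock implementation and not the experiment described above: the type, the function names, and the spin bounds are all hypothetical, and sched_yield() merely stands in for the real blocking wait (a semaphore or a latch). The idea is to spin for a bounded number of attempts before blocking, and to grow or shrink that bound depending on whether spinning paid off.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MIN_SPINS 8         /* hypothetical bounds for the spin budget */
#define MAX_SPINS 1024

typedef struct
{
    atomic_bool locked;
    atomic_int  spin_limit; /* adapts to observed contention */
} adaptive_lock;

void
adaptive_lock_init(adaptive_lock *l)
{
    atomic_init(&l->locked, false);
    atomic_init(&l->spin_limit, MIN_SPINS);
}

bool
adaptive_lock_try_acquire(adaptive_lock *l)
{
    bool expected = false;

    return atomic_compare_exchange_strong_explicit(&l->locked, &expected, true,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

void
adaptive_lock_acquire(adaptive_lock *l)
{
    int limit = atomic_load_explicit(&l->spin_limit, memory_order_relaxed);

    for (int i = 0; i < limit; i++)
    {
        if (adaptive_lock_try_acquire(l))
        {
            /* Spinning succeeded after a wait: allow more spins next time. */
            if (i > 0 && limit < MAX_SPINS)
                atomic_store_explicit(&l->spin_limit, limit * 2,
                                      memory_order_relaxed);
            return;
        }
    }

    /* Spinning did not pay off: shrink the budget, then block. */
    if (limit > MIN_SPINS)
        atomic_store_explicit(&l->spin_limit, limit / 2,
                              memory_order_relaxed);

    while (!adaptive_lock_try_acquire(l))
        sched_yield();      /* stand-in for a semaphore or latch wait */
}

void
adaptive_lock_release(adaptive_lock *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

The spin budget is updated with relaxed loads and stores, so concurrent updates may overwrite each other; since it is only a heuristic, that imprecision is acceptable in this sketch.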

Greetings,

Andres Freund


