Re: Non-robustness in pmsignal.c - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Non-robustness in pmsignal.c
Date
Msg-id 20221008002829.u6iqt7imdlg72kxq@awork3.anarazel.de
Whole thread Raw
In response to Non-robustness in pmsignal.c  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Non-robustness in pmsignal.c  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2022-10-07 19:57:35 -0400, Tom Lane wrote:
> As I mentioned in another thread, I came across a reproducible
> situation in which a memory clobber in a child backend crashes
> the postmaster too, at least on FreeBSD/arm64.  Needless to say,
> this is Not Cool.

Ugh.


> I've now traced down what is happening, and it's this:
> 
> 1. Careless coding in aset.c causes it to decide to wipe_mem
> the universe.  (I'll have more to say about that separately;
> the point of this thread is keeping the postmaster alive
> afterwards.)  Apparently, there's not any non-live memory
> space between process-local memory and shared memory on this
> platform, so the failing backend manages to trash shared memory
> too before it finally hits SIGSEGV.

Perhaps it'd be worth mark a page or two inaccessible, via
mprotect(PROT_NONE), at the start and end of shared memory. I've wondered
about a debugging mode where we do that after separate shared memory
allocations even. But start/end would be something we could conceivably always
enable.


> 2. Most of the background processes die on something like
> 
> TRAP: FailedAssertion("latch->owner_pid == MyProcPid", File: "latch.c", Line: 686, PID: 5916)
> 
> or they encounter what seems to be a stuck spinlock.  The postmaster,
> however, SIGSEGVs.  It's not supposed to do that; it is supposed to
> be sufficiently arms-length from shared memory that it can recover
> despite a backend trashing shared memory contents.

> 3. The cause of the SIGSEGV is that AssignPostmasterChildSlot
> naively believes that it can trust PMSignalState->next_child_flag
> to be a valid array index, so after that's been clobbered with
> something like 0x7f7f7f7f we index off the end of memory.
> I see no good reason for that state variable to be in shared memory
> at all, so the attached patch just moves it to postmaster static
> data.  We also need a less-exposed copy of the array size variable.

Those make sense to me.


> 4. That's enough to stop the SIGSEGV crash, but the postmaster
> still fails to recover, because then it hits
> 
>     elog(FATAL, "no free slots in PMChildFlags array");
> 
> since all of the array entries have been clobbered as well.
> In the attached patch, I fixed this by treating the case similarly
> to failure to fork a new child process.  This seems to be enough
> to let the postmaster survive, and recover after it starts noticing
> crashing children.

Why are we even tracking PM_CHILD_UNUSED / PM_CHILD_ASSIGNED in shared memory?
ISTM those should live in postmaster local memory (maybe copied to shared
memory). PM_CHILD_ACTIVE and PM_CHILD_WALSENDER do have to live in shared
memory, but ...

Your fix seems ok. We really ought to deduplicate the way we start postmaster
children, but that's obviously work for another day.


> 5. It's possible that we should take some proactive steps to get out
> of the "no free slots" situation, rather than just wait for some
> child to crash.  I'm inclined not to, however.  It'd be hard-to-test
> corner-case code, and given the lack of field reports like this,
> the situation must be awfully rare.

Agreed.


Are you thinking these should be backpatched?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Non-robustness in pmsignal.c
Next
From: Tom Lane
Date:
Subject: Re: Non-robustness in pmsignal.c