Re: Crash during backend start when low on memory - Mailing list pgsql-bugs

From Mats Kindahl
Subject Re: Crash during backend start when low on memory
Date
Msg-id CA+14426M6j-YrkNLu9Hsjerbh3nXaz1gQ-Di+L+EaDoZtkdMVg@mail.gmail.com
Whole thread Raw
In response to Re: Crash during backend start when low on memory  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Crash during backend start when low on memory  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Fri, Jan 13, 2023 at 4:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2023-Jan-13, Mats Kindahl wrote:
>> I have an improved patch based on raising an error when malloc fails.

> Hmm, but I'm not sure I like what you did to BackendStartup:

Yeah, the BackendStartup change is 100% wrong; it is replacing
perfectly good code that recovers correctly with bad code that
will take down the postmaster (not a backend child!) on OOM.

AFAICT, the error is caught by the caller (using PG_TRY), executes some cleanup code, and then continues executing, so it shouldn't take down the postmaster.
 

By and large I don't like anything about this patch.  You could
get the same results (i.e. elog(ERROR) on OOM) by replacing the
malloc calls with pallocs.

Well... as Alvaro pointed out, just replacing it with palloc will blow out part of the structure being allocated for the port information when the backend starts running.

However, continuing on an approach using memory contexts instead of directly allocating memory, something like this could be done:
  1. There is already a memory context PostmasterContext for the postmaster startup code, so only allocate stuff here that is not needed after a backend has started.
  2. Add a new memory context for "backend startup" (say, BackendStartupContext) and use that for allocating stuff that should exist after the backend starts. For example, the port information. (It might be possible to move more information here, such as the HBA structures, which would make it clearer that this is startup memory for the backend. Not sure if there are some potential issues with this.)
  3. Create the BackendStartupContext when starting the backend.
  4. Replace previous malloc and friends with palloc in the PostmasterContext (where suitable).
  5. Replace allocation of port structure using malloc with allocation in BackendStartupContext
  6. When the backend starts, it would toss the memory context from the postmaster, but that would leave the port structure intact. The backend already tosses the PostmasterContext, but not until after it has done authentication.
  7. When the postmaster has spawned the backend, it should toss the BackendStartupContext since it is not relevant any more.
  8. Since creating the BackendStartupContext can fail with an error, we need to wrap this in a PG_TRY block. Otherwise the postmaster will exit on error.
This provides a few advantages compared to allocating memory using malloc and friends:
  1. We allocate memory for the postmaster context in one place, when the actual startup is running. If we're low on memory at this point, we can abort early.
  2. We allocate memory for the backend startup in one place, when starting the backend. If we're low on memory at this point, we can abort early.
  3. Code is significantly easier to read and will not require a lot of checks during startup.
I have a tentative patch, but it contains some problems so I will solve these first.

Best wishes,
Mats Kindahl


                        regards, tom lane

pgsql-bugs by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: BUG #17698: On SIGTERM, psql terminates, but leaves the statement running
Next
From: Tom Lane
Date:
Subject: Re: Crash during backend start when low on memory