Re: Crash during backend start when low on memory - Mailing list pgsql-bugs

From Mats Kindahl
Subject Re: Crash during backend start when low on memory
Date
Msg-id CA+14424atYYrCh9rf+VYHWB_uAQoGr_JsrZjVvX2LnNerXpipQ@mail.gmail.com
Whole thread Raw
In response to Re: Crash during backend start when low on memory  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
On Mon, Feb 6, 2023 at 7:41 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2023-02-06 08:37:17 +0100, Mats Kindahl wrote:
> On Mon, Feb 6, 2023 at 12:18 AM Andres Freund <andres@anarazel.de> wrote:
> > >, but I cannot see how that can cause problems since it is "just"
> > > a setjmp() and longjmp(). It allocates a structure on the stack that
> > > contains the values of the registers and the signal set; it is big, but
> > not
> > > exceedingly so. If somebody could enlighten me about if there are any
> > > problems with this, I would be very grateful.
> >
> > If you aren't careful you end up with PG_exception_stack in a backend
> > still pointing to that PG_TRY. Which means that an error during backend
> > startup would jump into postmaster code.  Which would not be good.
> >
>
> I did run into this case when "failing" some of the memory allocations
> inside the backend with a debugger (see the PATCH.md).
>
> There are, however, already cases in the backend startup that can raise an
> error (for example, the code that creates the MessageContext in
> PostgresMain). So, either if one of these error cases is triggered, or if
> you make a mistake and add an ereport(ERROR,...) at the wrong place during
> startup, it could cause this problem. In other words, wouldn't adding a
> PG_TRY() in the code spawning the backend *prevent* such a problem and
> hence reduce the risk of a mistake in code causing the backend jumping into
> postmaster code (of course assuming that the code is written correctly).

It's fine to add ereport(ERROR)s inside the backend startup - elog.c knows how
to deal with no error handlers being set up. We just promote the ERROR to a
FATAL. That's not at all the same as having a PG_exception_stack set up at a
point we don't want to return to.

Remember that backends are forked, so they inherit the stack that postmaster
has set up.


> > Particularly because this is a bugfix we need to make the fix more
> > minimal than the patches proposed so far.
> >
>
> I can remove the PG_TRY() and use ereport(LOG, ...) + ExitPostmaster if it
> feels like a safer path for incorporating a bug fix, but note that there is
> still a risk that the backend will tear down the postmaster. For example,
> allocating memory for a memory context can fail, or throwing another error
> inside the backend startup can also fail, so I think that there will be
> more work to do after this.

I can't really follow.

My apologies, I'll add my reasoning below.

However, since you wanted a patch not using PG_TRY for the bugfix I have attached a patch that does not use it and instead uses memory context functions and the MCXT_ALLOC_NO_OOM flag where applicable. I've tested it manually similarly to the other patches and it passes "make check" and "make installcheck". I have added comments to explain my reasoning for the important allocations: what memory context it is allocated in and why.

My reasoning was that using a PG_TRY around the call to BackendStartup() would prevent any errors generated now and in future code from killing the postmaster, so it would be a more safe alternative. The difference in size is not that big between the patches, but this patch does not use exception handling.

Best wishes,
Mats Kindahl


Greetings,

Andres Freund
Attachment

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware)
Next
From: Keyerror Smart
Date:
Subject: Re: BUG #17812: LOCK TABLE IN ACCESS EXCLUSIVE MODE with a view returns an empty tuple set