On Mon, Feb 6, 2023 at 7:41 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-02-06 08:37:17 +0100, Mats Kindahl wrote: > On Mon, Feb 6, 2023 at 12:18 AM Andres Freund <andres@anarazel.de> wrote: > > >, but I cannot see how that can cause problems since it is "just" > > > a setjmp() and longjmp(). It allocates a structure on the stack that > > > contains the values of the registers and the signal set; it is big, but > > not > > > exceedingly so. If somebody could enlighten me about if there are any > > > problems with this, I would be very grateful. > > > > If you aren't careful you end up with PG_exception_stack in a backend > > still pointing to that PG_TRY. Which means that an error during backend > > startup would jump into postmaster code. Which would not be good. > > > > I did run into this case when "failing" some of the memory allocations > inside the backend with a debugger (see the PATCH.md). > > There are, however, already cases in the backend startup that can raise an > error (for example, the code that creates the MessageContext in > PostgresMain). So, either if one of these error cases is triggered, or if > you make a mistake and add an ereport(ERROR,...) at the wrong place during > startup, it could cause this problem. In other words, wouldn't adding a > PG_TRY() in the code spawning the backend *prevent* such a problem and > hence reduce the risk of a mistake in code causing the backend jumping into > postmaster code (of course assuming that the code is written correctly).
It's fine to add ereport(ERROR)s inside the backend startup - elog.c knows how to deal with no error handlers being set up. We just promote the ERROR to a FATAL. That's not at all the same as having a PG_exception_stack set up at a point we don't want to return to.
Remember that backends are forked, so they inherit the stack that postmaster has set up.
> > Particularly because this is a bugfix we need to make the fix more > > minimal than the patches proposed so far. > > > > I can remove the PG_TRY() and use ereport(LOG, ...) + ExitPostmaster if it > feels like a safer path for incorporating a bug fix, but note that there is > still a risk that the backend will tear down the postmaster. For example, > allocating memory for a memory context can fail, or throwing another error > inside the backend startup can also fail, so I think that there will be > more work to do after this.
I can't really follow.
My apologies, I'll add my reasoning below.
However, since you wanted a patch not using PG_TRY for the bugfix I have attached a patch that does not use it and instead uses memory context functions and the MCXT_ALLOC_NO_OOM flag where applicable. I've tested it manually similarly to the other patches and it passes "make check" and "make installcheck". I have added comments to explain my reasoning for the important allocations: what memory context it is allocated in and why.
My reasoning was that using a PG_TRY around the call to BackendStartup() would prevent any errors generated now and in future code from killing the postmaster, so it would be a more safe alternative. The difference in size is not that big between the patches, but this patch does not use exception handling.