Home > mailing lists

Re: Crash during backend start when low on memory - Mailing list pgsql-bugs

From	Alvaro Herrera
Subject	Re: Crash during backend start when low on memory
Date	December 15, 2022 17:20:08
Msg-id	20221215172008.islvqfyysom3ufxy@alvherre.pgsql Whole thread Raw
In response to	Re: Crash during backend start when low on memory (Mats Kindahl <mats@timescale.com>)
Responses	Re: Crash during backend start when low on memory
List	pgsql-bugs

Tree view

On 2022-Dec-14, Mats Kindahl wrote:

> PostmasterContext is set quite early in the procedure so it should not be a
> problem to use it. I think we can use PostmasterContext rather than
> TopMemoryContext. The memory for the allocations are passed down to
> PostmasterMain() and used inside InitPostgres() but as far as I can tell,
> they are not used after the InitPostgres() call. The PostmasterContex is
> destroyed right after the call to InitPostgres() inside PostmasterMain()
> and then the rest is allocated in other contexts.

Well, this is what PostgresMain has to say about it:

    /*
     * If the PostmasterContext is still around, recycle the space; we don't
     * need it anymore after InitPostgres completes.  Note this does not trash
     * *MyProcPort, because ConnCreate() allocated that space with malloc()
     * ... else we'd need to copy the Port data first.  Also, subsidiary data
     * such as the username isn't lost either; see ProcessStartupPacket().
     */
    if (PostmasterContext)
    {
        MemoryContextDelete(PostmasterContext);

So I think blowing away part of struct Port independently of the
containing struct is a bad idea.  Why isn't more appropriate to do what
ConnCreate does?

    if (!(port = (Port *) calloc(1, sizeof(Port))))
    {
        ereport(LOG,
                (errcode(ERRCODE_OUT_OF_MEMORY),
                 errmsg("out of memory")));
        ExitPostmaster(1);
    }

... ugh.  I have to admit that killing postmaster due to a transient
out-of-memory makes me a bit nervous -- do note that this
ExitPostmaster() happens before we've forked the backend, so this code
does bring the server down (!!).  If we want to make postmaster more
robust in the face of OOM, we should also change this code somehow, yes?

I agree with Mats' premise when starting the thread -- even if the
server is not in great shape, it doesn't seem like taking the whole
service down is a great response.  It would be better to just refuse the
connection with some error and let existing connections chug along,
poorly though they can.  Maybe they'll also die soon, freeing memory;
but that'll be no good if Postmaster is gone altogether.

Now, it would be better if we could test this more thoroughly.  Perhaps
it's possible to use something like the mallocfail.c lib that Heikki
wrote a few years ago, teaching it to handle calloc
https://postgr.es/m/547480DE.4040408@vmware.com

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)

pgsql-bugs by date:

From: reiner peterke
Date: 15 December 2022, 17:02:26
Subject: Re: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'

From: Tom Lane
Date: 15 December 2022, 18:56:30
Subject: Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)

Re: Crash during backend start when low on memory - Mailing list pgsql-bugs

Previous

Next