
From: Stephen Frost
Subject: Re: BUG #17793: Query with large number of joins crashes PostgreSQL
Date:
Msg-id: Y+w8wDqJaaAZrB3E@tamriel.snowman.net
In response to: Re: BUG #17793: Query with large number of joins crashes PostgreSQL (Joe Conway <mail@joeconway.com>)
List: pgsql-bugs

Greetings,

* Joe Conway (mail@joeconway.com) wrote:
> On 2/14/23 09:47, Tom Lane wrote:
> > Francisco Olarte <folarte@peoplecall.com> writes:
> > > On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
> > > <noreply@postgresql.org> wrote:
> > > > It looks like the OS is killing the process due to running OOM, which is not
> > > > very surprising when looking at the query. Is this expected, or should PG
> > > > have guards in place to prevent this from happening?
> >
> > > When you run postgres in an environment where someone ( OOM killer )
> > > can kill -9 it, protection is hard. IIRC OOM can kill you because ANOTHER
> > > process touches memory, among other things.
> >
> > Yeah.  We have no visibility into what the OOM killer will choose to
> > allow or not allow at any instant.
> >
> > > ( I do run DBs in machines w/ overcommit disabled, this prevents it
> > > from happening, but it is not Pg who prevents it ).
> >
> > If overcommit-disabled doesn't seem practical, another idea that's
> > been recommended is to start the postmaster under a "ulimit -v"
> > setting chosen to cause plain ENOMEM failures before the OOM
> > killer kicks in.

Unfortunately, that limit applies on a per-process basis and therefore
isn't as useful as we'd like it to be (ideally, we'd allow all of the
backends to keep allocating until they collectively hit some global
"only use X amount of memory" limit instead).

> Given:
> > It looks like the OS is killing the process due to running OOM ...
> > I could reproduce it using a fresh docker container of the image
> > "postgres:13.10".
>
> I gather the OP is running Postgres in a container. If so, and if they have
> a memory.limit set (also assuming cgroup v1, otherwise memory.max for cgroup
> v2), disabling overcommit at the host level will not prevent an OOM kill
> when the cgroup exceeds memory.limit.
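
Right, and inside a container the effective ceiling is whatever the
cgroup says, regardless of host-level overcommit settings.  Just to
make that concrete, here's a rough sketch (not anything PG does today)
of reading the limit the cgroup imposes, assuming cgroup v2 with the
container's cgroup visible at /sys/fs/cgroup:

/* Rough sketch, not PostgreSQL code: read the memory limit imposed on
 * the current cgroup, assuming cgroup v2 and that the process's cgroup
 * is visible at /sys/fs/cgroup (as in a typical container).  The file
 * holds either a byte count or the string "max". */
#include <stdio.h>
#include <string.h>

int
main(void)
{
    char        buf[64];
    FILE       *f = fopen("/sys/fs/cgroup/memory.max", "r");

    if (f == NULL)
    {
        perror("fopen");
        return 1;
    }
    if (fgets(buf, sizeof(buf), f) == NULL)
    {
        fclose(f);
        return 1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';     /* strip trailing newline */

    if (strcmp(buf, "max") == 0)
        printf("no cgroup memory limit\n");
    else
        printf("cgroup memory limit: %s bytes\n", buf);
    return 0;
}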

Indeed, we need a way for users to specify a limit that PG itself
tracks and enforces so that PG never hits it.  There's work being done
both to let users see how much memory each backend is using and to
track an overall allocation amount in shared memory across all
processes, so that we can choose to fail an allocation in a given
backend instead of having the kernel decide to run the OOM killer.
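
To sketch the shape of that (purely illustrative, nothing like the
actual patches): a counter kept in shared memory that every backend
bumps before allocating and checks against a cluster-wide cap, so the
backend gets a clean allocation failure rather than the kernel picking
a victim:

/* Illustrative sketch only -- not PostgreSQL's implementation.  A
 * counter in shared memory tracks the bytes reserved by all "backends";
 * an allocation is refused (NULL/ENOMEM) once the cluster-wide cap
 * would be exceeded, instead of the OOM killer stepping in later.
 * The cap value below is made up for the example. */
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define CLUSTER_MEMORY_CAP  ((size_t) 100 * 1024 * 1024)    /* 100MB, made up */

typedef struct
{
    _Atomic size_t total_reserved;  /* bytes reserved across all backends */
} MemoryAccounting;

static MemoryAccounting *accounting;    /* lives in shared memory */

/* Reserve "size" bytes against the global cap, or fail without allocating. */
static void *
accounted_malloc(size_t size)
{
    size_t      old = atomic_load(&accounting->total_reserved);

    do
    {
        if (old + size > CLUSTER_MEMORY_CAP)
        {
            errno = ENOMEM;     /* fail here, in this backend ... */
            return NULL;        /* ... not via a SIGKILL from the kernel */
        }
    } while (!atomic_compare_exchange_weak(&accounting->total_reserved,
                                           &old, old + size));

    return malloc(size);
}

static void
accounted_free(void *ptr, size_t size)
{
    free(ptr);
    atomic_fetch_sub(&accounting->total_reserved, size);
}

int
main(void)
{
    /* Anonymous shared mapping: children forked afterwards would share
     * it, much as backends share the postmaster's shared memory. */
    accounting = mmap(NULL, sizeof(MemoryAccounting),
                      PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (accounting == MAP_FAILED)
        return 1;
    atomic_init(&accounting->total_reserved, 0);

    void *a = accounted_malloc(60 * 1024 * 1024);   /* fits under the cap */
    void *b = accounted_malloc(60 * 1024 * 1024);   /* refused: would exceed it */

    printf("first: %s, second: %s\n",
           a ? "allocated" : "refused", b ? "allocated" : "refused");

    accounted_free(a, 60 * 1024 * 1024);
    return 0;
}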

Further, once we're managing this ourselves, we can also make policy
choices such as "keep X amount reserved for system processes like
autovacuum" or "don't fail allocations for system processes like
autovacuum" (just an example, not saying we necessarily want to do
exactly that, but the point is that it's definitely nicer when we have
control instead of getting hit with a memory allocation failure or,
worse, the OOM killer deciding to kill -9 PG).
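
As a toy illustration of that kind of policy (again hypothetical, not a
proposal), the cap check could simply hold back a slice of the budget
that only system processes are allowed to use, and never refuse their
requests:

/* Hypothetical policy sketch, continuing the example above: hold back
 * part of the cluster-wide budget for system processes (autovacuum and
 * friends) and never refuse their allocations.  All numbers made up. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define CLUSTER_MEMORY_CAP      ((size_t) 100 * 1024 * 1024)    /* made up */
#define SYSTEM_RESERVED_BYTES   ((size_t) 16 * 1024 * 1024)     /* made up */

static bool
allocation_allowed(size_t already_reserved, size_t request, bool is_system)
{
    if (is_system)
        return true;    /* policy choice: never refuse system processes */

    /* Ordinary backends only get the budget minus the reserved slice. */
    return already_reserved + request <=
           CLUSTER_MEMORY_CAP - SYSTEM_RESERVED_BYTES;
}

int
main(void)
{
    size_t      reserved = 90 * 1024 * 1024;    /* 90MB already in use */
    size_t      request = 8 * 1024 * 1024;      /* 8MB being asked for */

    /* The normal backend is refused (it would eat into the reserved
     * slice); the system process is not. */
    printf("normal backend: %s\n",
           allocation_allowed(reserved, request, false) ? "ok" : "refused");
    printf("system process: %s\n",
           allocation_allowed(reserved, request, true) ? "ok" : "refused");
    return 0;
}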

Thanks,

Stephen

