Thread: BUG #17793: Query with large number of joins crashes PostgreSQL

BUG #17793: Query with large number of joins crashes PostgreSQL

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17793
Logged by:          Tom
Email address:      tom@quist.de
PostgreSQL version: 13.10
Operating system:   Debian 10.2.1-6 (Docker image postgres:13.10)
Description:

Hey,

We had a unit test that produced a query similar to the following one,
which caused Postgres to crash and reinitialize with automatic recovery:
https://gist.github.com/tomquist/a58b62d1a15ce99a67657ed636129e9b

These are the client logs:
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

These are the server logs when the crash happens:
2023-02-14 09:17:31.440 UTC [1] LOG:  server process (PID 35) was terminated
by signal 9: Killed
2023-02-14 09:17:31.440 UTC [1] DETAIL:  Failed process was running: with
app_user as (select 'a' as id, 'organizationId' as organization_id,
'Africa/Abidjan' as timezone),
    employee as (select 'a' as id, 'organizationId' as organization_id, 'a' as
user_id, 'a' as manager_id)
    SELECT DISTINCT app_user.id FROM app_user LEFT OUTER JOIN employee AS
employee_1 ON 
    employee_1.user_id = app_user.id LEFT OUTER JOIN employee AS
manager_employee_1 ON (employee_1.manager_id = manager_employee_1.id AND
manager_employee_1.user_id IN (SELECT app_user.id FROM app_user WHERE
(app_user.timezone = 'Africa/Abidjan' AND app_user.organization_id =
'organizationId')))
     LEFT OUTER JOIN employee AS employee_2 ON 
    employee_2.user_id = app_user.id LEFT OUTER JOIN employee AS
manager_employee_2 ON (employee_2.manager_id = manager_employee_2.id AND
manager_employee_2.user_id IN (SELECT app_user.id FROM app_user WHERE
(app_user.timezone = 'Africa/Algiers' AND app_user.organization_id =
'organizationId')))
     LEFT OUTER JOIN employee AS employee_3 ON 
    employee_3.user_id = app_user.id LEFT OUTER JOIN employee
2023-02-14 09:17:31.442 UTC [1] LOG:  terminating any other active server
processes
2023-02-14 09:17:31.446 UTC [32] WARNING:  terminating connection because of
crash of another server process
2023-02-14 09:17:31.446 UTC [32] DETAIL:  The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2023-02-14 09:17:31.446 UTC [32] HINT:  In a moment you should be able to
reconnect to the database and repeat your command.
2023-02-14 09:17:31.461 UTC [37] FATAL:  the database system is in recovery
mode
2023-02-14 09:17:31.522 UTC [1] LOG:  all server processes terminated;
reinitializing
2023-02-14 09:17:31.546 UTC [38] LOG:  database system was interrupted; last
known up at 2023-02-14 09:16:05 UTC
2023-02-14 09:17:31.779 UTC [38] LOG:  database system was not properly shut
down; automatic recovery in progress
2023-02-14 09:17:31.781 UTC [38] LOG:  redo starts at 0/15D7070
2023-02-14 09:17:31.782 UTC [38] LOG:  invalid record length at 0/15D70A8:
wanted 24, got 0
2023-02-14 09:17:31.782 UTC [38] LOG:  redo done at 0/15D7070
2023-02-14 09:17:31.791 UTC [1] LOG:  database system is ready to accept
connections

It looks like the OS is killing the process due to running OOM, which is not
very surprising when looking at the query. Is this expected, or should PG
have guards in place to prevent this from happening?

I could reproduce it using a fresh docker container of the image
"postgres:13.10".
SELECT version();
> PostgreSQL 13.10 (Debian 13.10-1.pgdg110+1) on aarch64-unknown-linux-gnu,
compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit


Re: BUG #17793: Query with large number of joins crashes PostgreSQL

From
Francisco Olarte
Date:
Tom:

On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
<noreply@postgresql.org> wrote:
...
> These are the server logs when the crash happens:
> 2023-02-14 09:17:31.440 UTC [1] LOG:  server process (PID 35) was terminated
> by signal 9: Killed
...
> It looks like the OS is killing the process due to running OOM, which is not
> very surprising when looking at the query. Is this expected, or should PG
> have guards in place to prevent this from happening?

When you run postgres in an environment where someone ( OOM killer )
can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER
process touches memory, among other things.

( I do run DBs in machines w/ overcommit disabled, this prevents it
from happening, but it is not Pg who prevents it ).
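
(On Linux that usually means something like the sketch below; the
ratio is only an example and has to be tuned to the machine's RAM and
swap mix:)

    # strict accounting: requests beyond the commit limit fail with ENOMEM
    # up front instead of the OOM killer firing later
    sysctl -w vm.overcommit_memory=2
    # CommitLimit = swap + vm.overcommit_ratio% of RAM; 80 is only an example
    sysctl -w vm.overcommit_ratio=80
    # (persist both in /etc/sysctl.conf or /etc/sysctl.d/ to survive reboots)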

FOS.



Re: BUG #17793: Query with large number of joins crashes PostgreSQL

From
Tom Lane
Date:
Francisco Olarte <folarte@peoplecall.com> writes:
> On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> It looks like the OS is killing the process due to running OOM, which is not
>> very surprising when looking at the query. Is this expected, or should PG
>> have guards in place to prevent this from happening?

> When you run postgres in an environment where someone ( OOM killer )
> can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER
> process touches memory, among other things.

Yeah.  We have no visibility into what the OOM killer will choose to
allow or not allow at any instant.

> ( I do run DBs in machines w/ overcommit disabled, this prevents it
> from happening, but it is not Pg who prevents it ).

If overcommit-disabled doesn't seem practical, another idea that's
been recommended is to start the postmaster under a "ulimit -v"
setting chosen to cause plain ENOMEM failures before the OOM
killer kicks in.
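
(A minimal sketch of that, with the value in kB; the 4 GB figure and the
data directory path are only examples and need tuning for the workload:)

    # RLIMIT_AS is inherited by every backend the postmaster forks
    ulimit -v 4194304                                   # ~4 GB per process, in kB
    pg_ctl -D /var/lib/postgresql/data -l logfile start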

            regards, tom lane



Re: BUG #17793: Query with large number of joins crashes PostgreSQL

From
Joe Conway
Date:
On 2/14/23 09:47, Tom Lane wrote:
> Francisco Olarte <folarte@peoplecall.com> writes:
>> On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
>> <noreply@postgresql.org> wrote:
>>> It looks like the OS is killing the process due to running OOM, which is not
>>> very surprising when looking at the query. Is this expected, or should PG
>>> have guards in place to prevent this from happening?
> 
>> When you run postgres in an environment where someone ( OOM killer )
>> can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER
>> process touches memory, among other things.
> 
> Yeah.  We have no visibility into what the OOM killer will choose to
> allow or not allow at any instant.
> 
>> ( I do run DBs in machines w/ overcommit disabled, this prevents it
>> from happening, but it is not Pg who prevents it ).
> 
> If overcommit-disabled doesn't seem practical, another idea that's
> been recommended is to start the postmaster under a "ulimit -v"
> setting chosen to cause plain ENOMEM failures before the OOM
> killer kicks in.

Given:
> It looks like the OS is killing the process due to running OOM ...
> I could reproduce it using a fresh docker container of the image
> "postgres:13.10".

I gather the OP is running Postgres in a container. If so, and if they 
have a memory.limit set (also assuming cgroup v1, otherwise memory.max 
for cgroup v2), disabling overcommit at the host level will not prevent 
an OOM kill when the cgroup exceeds memory.limit.
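
(Concretely, that is the limit a container picks up from something like
the sketch below; the password value is an example, the cgroup paths
depend on cgroup version and mount layout, and the kernel enforces the
limit with its cgroup OOM killer regardless of the host's overcommit
setting:)

    # start the same image with a cgroup memory limit
    docker run -d --memory=512m -e POSTGRES_PASSWORD=secret postgres:13.10
    # inside the container, the effective limit is visible at
    cat /sys/fs/cgroup/memory.max                     # cgroup v2
    cat /sys/fs/cgroup/memory/memory.limit_in_bytes   # cgroup v1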


-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: BUG #17793: Query with large number of joins crashes PostgreSQL

From
Francisco Olarte
Date:
Hi Tom:

On Tue, 14 Feb 2023 at 15:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> If overcommit-disabled doesn't seem practical, another idea that's
> been recommended is to start the postmaster under a "ulimit -v"
> setting chosen to cause plain ENOMEM failures before the OOM
> killer kicks in.

This may certainly help, but if I read the OOM-killer stuff correctly
you can have a case where Pg has its (big) share of memory allocated
and committed, some other processes have overallocated and have
uncommitted memory, and when one of these tries to commit some memory
the OOM killer kills Pg because it is the biggest memory user. Not an
overcommit fan, so not too sure.
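
(The usual partial mitigation for that victim-selection problem, along
the lines of the "Linux Memory Overcommit" notes in the docs, is to make
the postmaster itself an unlikely target while leaving individual
backends killable; a sketch, assuming the postmaster is launched from a
shell script and the data directory path is only an example:)

    # in the start script, before exec'ing the postmaster:
    echo -1000 > /proc/self/oom_score_adj          # exempt the postmaster itself
    export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
    export PG_OOM_ADJUST_VALUE=0                   # forked backends revert to normal
    exec postgres -D /var/lib/postgresql/data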

FOS



Re: BUG #17793: Query with large number of joins crashes PostgreSQL

From
Stephen Frost
Date:
Greetings,

* Joe Conway (mail@joeconway.com) wrote:
> On 2/14/23 09:47, Tom Lane wrote:
> > Francisco Olarte <folarte@peoplecall.com> writes:
> > > On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
> > > <noreply@postgresql.org> wrote:
> > > > It looks like the OS is killing the process due to running OOM, which is not
> > > > very surprising when looking at the query. Is this expected, or should PG
> > > > have guards in place to prevent this from happening?
> >
> > > When you run postgres in an environment where someone ( OOM killer )
> > > can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER
> > > process touches memory, among other things.
> >
> > Yeah.  We have no visibility into what the OOM killer will choose to
> > allow or not allow at any instant.
> >
> > > ( I do run DBs in machines w/ overcommit disabled, this prevents it
> > > from happening, but it is not Pg who prevents it ).
> >
> > If overcommit-disabled doesn't seem practical, another idea that's
> > been recommended is to start the postmaster under a "ulimit -v"
> > setting chosen to cause plain ENOMEM failures before the OOM
> > killer kicks in.

Unfortunately, that ends up being on a per-process basis and therefore
isn't as useful as we'd like it to be: each backend gets its own cap, so
a few hundred backends can together still exhaust the machine (ideally,
we'd allow all of the backends to allocate up until we hit some global
"only use X amount of memory" limit instead).

> Given:
> > It looks like the OS is killing the process due to running OOM ...
> > I could reproduce it using a fresh docker container of the image
> > "postgres:13.10".
>
> I gather the OP is running Postgres in a container. If so, and if they have
> a memory.limit set (also assuming cgroup v1, otherwise memory.max for cgroup
> v2), disabling overcommit at the host level will not prevent an OOM kill
> when the cgroup exceeds memory.limit

Indeed, we need a way to allow users to specify a limit that PG itself
keeps track of and enforces.  There's work being done both to let users
see how much memory each backend is using and to track an overall
allocation amount in shared memory across all processes, so that we can
decide to fail an allocation in a given backend instead of having the
kernel decide to run the OOM killer.

Further, once we're managing this, we can also make choices such as
"keep X amount reserved for system processes like autovacuum" and "don't
fail allocations for system processes like autovacuum" (just an example,
not saying we necessarily want to do exactly that, but the point is that
it's definitely nicer when we have control instead of getting hit with a
memory allocation failure or, worse, the OOM killer deciding to kill -9
PG).
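
(Part of the visibility half of that already exists in later releases;
for example, on PostgreSQL 14 and up, and assuming suitable privileges,
something like:)

    # memory contexts of the current backend (view added in PG 14)
    psql -c "SELECT name, used_bytes FROM pg_backend_memory_contexts ORDER BY used_bytes DESC LIMIT 5"
    # ask the backend with PID 12345 (an example) to dump its contexts to the server log
    psql -c "SELECT pg_log_backend_memory_contexts(12345)"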

Thanks,

Stephen
