Thread: BUG #17793: Query with large number of joins crashes PostgreSQL
The following bug has been logged on the website:

Bug reference:      17793
Logged by:          Tom
Email address:      tom@quist.de
PostgreSQL version: 13.10
Operating system:   Debian 10.2.1-6 (Docker image postgres:13.10)

Description:

Hey, we had a unit test that produced a similar query to the following one and caused Postgres to crash and reinitialize with automatic recovery:
https://gist.github.com/tomquist/a58b62d1a15ce99a67657ed636129e9b

These are the client logs:

server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

These are the server logs when the crash happens:

2023-02-14 09:17:31.440 UTC [1] LOG: server process (PID 35) was terminated by signal 9: Killed
2023-02-14 09:17:31.440 UTC [1] DETAIL: Failed process was running: with app_user as (select 'a' as id, 'organizationId' as organization_id, 'Africa/Abidjan' as timezone), employee as (select 'a' as id, 'organizationId' as organization_id, 'a' as user_id, 'a' as manager_id) SELECT DISTINCT app_user.id FROM app_user LEFT OUTER JOIN employee AS employee_1 ON employee_1.user_id = app_user.id LEFT OUTER JOIN employee AS manager_employee_1 ON (employee_1.manager_id = manager_employee_1.id AND manager_employee_1.user_id IN (SELECT app_user.id FROM app_user WHERE (app_user.timezone = 'Africa/Abidjan' AND app_user.organization_id = 'organizationId'))) LEFT OUTER JOIN employee AS employee_2 ON employee_2.user_id = app_user.id LEFT OUTER JOIN employee AS manager_employee_2 ON (employee_2.manager_id = manager_employee_2.id AND manager_employee_2.user_id IN (SELECT app_user.id FROM app_user WHERE (app_user.timezone = 'Africa/Algiers' AND app_user.organization_id = 'organizationId'))) LEFT OUTER JOIN employee AS employee_3 ON employee_3.user_id = app_user.id LEFT OUTER JOIN employee
2023-02-14 09:17:31.442 UTC [1] LOG: terminating any other active server processes
2023-02-14 09:17:31.446 UTC [32] WARNING: terminating connection because of crash of another server process
2023-02-14 09:17:31.446 UTC [32] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2023-02-14 09:17:31.446 UTC [32] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2023-02-14 09:17:31.461 UTC [37] FATAL: the database system is in recovery mode
2023-02-14 09:17:31.522 UTC [1] LOG: all server processes terminated; reinitializing
2023-02-14 09:17:31.546 UTC [38] LOG: database system was interrupted; last known up at 2023-02-14 09:16:05 UTC
2023-02-14 09:17:31.779 UTC [38] LOG: database system was not properly shut down; automatic recovery in progress
2023-02-14 09:17:31.781 UTC [38] LOG: redo starts at 0/15D7070
2023-02-14 09:17:31.782 UTC [38] LOG: invalid record length at 0/15D70A8: wanted 24, got 0
2023-02-14 09:17:31.782 UTC [38] LOG: redo done at 0/15D7070
2023-02-14 09:17:31.791 UTC [1] LOG: database system is ready to accept connections

It looks like the OS is killing the process due to running OOM, which is not very surprising when looking at the query. Is this expected, or should PG have guards in place to prevent this from happening?

I could reproduce it using a fresh docker container of the image "postgres:13.10".

SELECT version();
> PostgreSQL 13.10 (Debian 13.10-1.pgdg110+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
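[For reference, a minimal reproduction sketch along the lines the report describes, assuming Docker is available and that the query from the gist above has been saved locally as big_join_query.sql (the file name is illustrative):]

    # Throwaway PostgreSQL 13.10 container, as in the report
    docker run --rm --name pg-oom-repro -e POSTGRES_PASSWORD=secret -d postgres:13.10

    # Give initdb a few seconds to finish before connecting
    sleep 10

    # Feed the generated query to psql inside the container; on a
    # memory-constrained host the backend may be killed by the OOM killer,
    # matching the "server closed the connection unexpectedly" client error
    docker exec -i pg-oom-repro psql -U postgres < big_join_query.sql

    # The "terminated by signal 9: Killed" line should then appear here
    docker logs pg-oom-repro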
Tom:

On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form <noreply@postgresql.org> wrote:
...
> These are the server logs when the crash happens:
> 2023-02-14 09:17:31.440 UTC [1] LOG: server process (PID 35) was terminated
> by signal 9: Killed
...
> It looks like the OS is killing the process due to running OOM, which is not
> very surprising when looking at the query. Is this expected, or should PG
> have guards in place to prevent this from happening?

When you run postgres in an environment where someone ( OOM killer ) can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER process touches memory, among other things.

( I do run DBs in machines w/ overcommit disabled, this prevents it from happening, but it is not Pg who prevents it ).

FOS.
Francisco Olarte <folarte@peoplecall.com> writes:
> On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> It looks like the OS is killing the process due to running OOM, which is not
>> very surprising when looking at the query. Is this expected, or should PG
>> have guards in place to prevent this from happening?

> When you run postgres in an environment where someone ( OOM killer )
> can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER
> process touches memory, among other things.

Yeah.  We have no visibility into what the OOM killer will choose to
allow or not allow at any instant.

> ( I do run DBs in machines w/ overcommit disabled, this prevents it
> from happening, but it is not Pg who prevents it ).

If overcommit-disabled doesn't seem practical, another idea that's
been recommended is to start the postmaster under a "ulimit -v"
setting chosen to cause plain ENOMEM failures before the OOM
killer kicks in.

			regards, tom lane
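[A rough sketch of both suggestions on a plain Linux host; the ulimit value and data directory are placeholders, and a real setting would be sized against available RAM and max_connections:]

    # Option 1: strict overcommit accounting, so oversized allocations fail
    # with ENOMEM up front instead of the OOM killer firing later
    sudo sysctl -w vm.overcommit_memory=2

    # Option 2: cap the postmaster's virtual address space; backends inherit
    # the limit, so a runaway query gets an "out of memory" error rather
    # than a kill -9 (the 8 GB figure is only a placeholder)
    ulimit -v $((8 * 1024 * 1024))    # ulimit -v takes kilobytes
    pg_ctl -D /var/lib/postgresql/data start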
On 2/14/23 09:47, Tom Lane wrote:
> Francisco Olarte <folarte@peoplecall.com> writes:
>> On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
>> <noreply@postgresql.org> wrote:
>>> It looks like the OS is killing the process due to running OOM, which is not
>>> very surprising when looking at the query. Is this expected, or should PG
>>> have guards in place to prevent this from happening?
>
>> When you run postgres in an environment where someone ( OOM killer )
>> can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER
>> process touches memory, among other things.
>
> Yeah.  We have no visibility into what the OOM killer will choose to
> allow or not allow at any instant.
>
>> ( I do run DBs in machines w/ overcommit disabled, this prevents it
>> from happening, but it is not Pg who prevents it ).
>
> If overcommit-disabled doesn't seem practical, another idea that's
> been recommended is to start the postmaster under a "ulimit -v"
> setting chosen to cause plain ENOMEM failures before the OOM
> killer kicks in.

Given:

> It looks like the OS is killing the process due to running OOM
...
> I could reproduce it using a fresh docker container of the image
> "postgres:13.10".

I gather the OP is running Postgres in a container. If so, and if they have a memory.limit set (also assuming cgroup v1, otherwise memory.max for cgroup v2), disabling overcommit at the host level will not prevent an OOM kill when the cgroup exceeds memory.limit.

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
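[To illustrate the container case: a cgroup memory limit is enforced by the kernel regardless of the host's overcommit setting. A sketch assuming Docker and cgroup v2; under cgroup v1 the file would be memory/memory.limit_in_bytes instead:]

    # Same image, but with an explicit container memory limit
    docker run --rm --name pg-limited -e POSTGRES_PASSWORD=secret \
        --memory=2g -d postgres:13.10

    # The limit the cgroup will actually enforce
    docker exec pg-limited cat /sys/fs/cgroup/memory.max

    # Even with vm.overcommit_memory=2 on the host, once the processes in
    # this cgroup exceed 2g the kernel's OOM killer picks one of them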
Hi Tom:

On Tue, 14 Feb 2023 at 15:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If overcommit-disabled doesn't seem practical, another idea that's
> been recommended is to start the postmaster under a "ulimit -v"
> setting chosen to cause plain ENOMEM failures before the OOM
> killer kicks in.

This may certainly help, but if I read the oom-killer stuff correctly you can have a case where Pg has its (big) share of memory allocated and committed, some other processes have overallocated and have uncommitted memory, and when one of them tries to commit some memory the OOM killer kills Pg because it is the biggest memory user. Not an overcommit fan, so not too sure.

FOS
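[One partial mitigation for that scenario, described in the PostgreSQL documentation's notes on Linux memory overcommit, is to shield the postmaster itself from the OOM killer while leaving individual backends killable, so an OOM event kills one backend (followed by the automatic crash recovery seen in the log above) rather than the postmaster. A hedged sketch for a root-run startup script; paths are illustrative:]

    # Make the postmaster (which inherits this setting) immune to the OOM killer
    echo -1000 > /proc/self/oom_score_adj

    # Let child backends reset their score to 0 so they remain eligible;
    # the postmaster reads these variables at startup
    su postgres -c "env PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj PG_OOM_ADJUST_VALUE=0 pg_ctl -D /var/lib/postgresql/data start"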
Greetings,

* Joe Conway (mail@joeconway.com) wrote:
> On 2/14/23 09:47, Tom Lane wrote:
> > Francisco Olarte <folarte@peoplecall.com> writes:
> > > On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
> > > <noreply@postgresql.org> wrote:
> > > > It looks like the OS is killing the process due to running OOM, which is not
> > > > very surprising when looking at the query. Is this expected, or should PG
> > > > have guards in place to prevent this from happening?
> >
> > > When you run postgres in an environment where someone ( OOM killer )
> > > can K9 it, protection is hard. IIRC OOM can kill you because ANOTHER
> > > process touches memory, among other things.
> >
> > Yeah.  We have no visibility into what the OOM killer will choose to
> > allow or not allow at any instant.
> >
> > > ( I do run DBs in machines w/ overcommit disabled, this prevents it
> > > from happening, but it is not Pg who prevents it ).
> >
> > If overcommit-disabled doesn't seem practical, another idea that's
> > been recommended is to start the postmaster under a "ulimit -v"
> > setting chosen to cause plain ENOMEM failures before the OOM
> > killer kicks in.

Unfortunately, that ends up being on a per-process basis and therefore isn't as useful as we'd like it to be (ideally, we'd allow all of the backends to allocate up until we hit some global "only use X amount of memory" limit instead).

> Given:
> > It looks like the OS is killing the process due to running OOM
> ...
> > I could reproduce it using a fresh docker container of the image
> > "postgres:13.10".
>
> I gather the OP is running Postgres in a container. If so, and if they have
> a memory.limit set (also assuming cgroup v1, otherwise memory.max for cgroup
> v2), disabling overcommit at the host level will not prevent an OOM kill
> when the cgroup exceeds memory.limit

Indeed, we need a way to allow users to specify a limit that PG keeps track of and prevents itself from hitting. There's work being done both to let users see how much memory each backend is using and to track an overall allocation amount in shared memory across all processes, so that we can decide to fail an allocation in a given backend instead of having the kernel decide to run the OOM killer.

Further, once we're managing this, we can also make choices such as "keep X amount reserved for system processes like autovacuum" and "don't fail allocations for system processes like autovacuum" (just an example; not saying we necessarily want to do exactly that, but the point is that it's definitely nicer when we have control instead of getting hit with a memory allocation failure or, worse, the OOM killer deciding to kill -9 PG).

Thanks,

Stephen
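[On the visibility side, some of this already exists in releases newer than the reporter's 13.10: PostgreSQL 14 added the pg_backend_memory_contexts view and the pg_log_backend_memory_contexts() function. A sketch of how they might be used, assuming PostgreSQL 14 or later; this is inspection only, not the global limit discussed above:]

    # Largest memory contexts of the current session (PG 14+)
    psql -U postgres -c "
        SELECT name, parent, total_bytes, used_bytes
        FROM pg_backend_memory_contexts
        ORDER BY total_bytes DESC
        LIMIT 10;"

    # Ask another backend (PID 12345 is purely illustrative) to dump its
    # memory context usage to the server log (PG 14+)
    psql -U postgres -c "SELECT pg_log_backend_memory_contexts(12345);"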