BUG #17793: Query with large number of joins crashes PostgreSQL - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17793: Query with large number of joins crashes PostgreSQL |
Date | |
Msg-id | 17793-21fa598adbc21118@postgresql.org Whole thread Raw |
Responses |
Re: BUG #17793: Query with large number of joins crashes PostgreSQL
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17793 Logged by: Tom Email address: tom@quist.de PostgreSQL version: 13.10 Operating system: Debian 10.2.1-6 (Docker image postgres:13.10) Description: Hey, we had a unit test that produced a similar query to the following one and caused Postgres to crash and reinitialize with automatic recovery: https://gist.github.com/tomquist/a58b62d1a15ce99a67657ed636129e9b These are the client logs: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. These are the server logs when the crash happens: 2023-02-14 09:17:31.440 UTC [1] LOG: server process (PID 35) was terminated by signal 9: Killed 2023-02-14 09:17:31.440 UTC [1] DETAIL: Failed process was running: with app_user as (select 'a' as id, 'organizationId' as organization_id, 'Africa/Abidjan' as timezone), employee as (select 'a' as id, 'organizationId' as organization_id, 'a' as user_id, 'a' as manager_id) SELECT DISTINCT app_user.id FROM app_user LEFT OUTER JOIN employee AS employee_1 ON employee_1.user_id = app_user.id LEFT OUTER JOIN employee AS manager_employee_1 ON (employee_1.manager_id = manager_employee_1.id AND manager_employee_1.user_id IN (SELECT app_user.id FROM app_user WHERE (app_user.timezone = 'Africa/Abidjan' AND app_user.organization_id = 'organizationId'))) LEFT OUTER JOIN employee AS employee_2 ON employee_2.user_id = app_user.id LEFT OUTER JOIN employee AS manager_employee_2 ON (employee_2.manager_id = manager_employee_2.id AND manager_employee_2.user_id IN (SELECT app_user.id FROM app_user WHERE (app_user.timezone = 'Africa/Algiers' AND app_user.organization_id = 'organizationId'))) LEFT OUTER JOIN employee AS employee_3 ON employee_3.user_id = app_user.id LEFT OUTER JOIN employee 2023-02-14 09:17:31.442 UTC [1] LOG: terminating any other active server processes 2023-02-14 09:17:31.446 UTC [32] WARNING: terminating connection because of crash of another server process 2023-02-14 09:17:31.446 UTC [32] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-02-14 09:17:31.446 UTC [32] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2023-02-14 09:17:31.461 UTC [37] FATAL: the database system is in recovery mode 2023-02-14 09:17:31.522 UTC [1] LOG: all server processes terminated; reinitializing 2023-02-14 09:17:31.546 UTC [38] LOG: database system was interrupted; last known up at 2023-02-14 09:16:05 UTC 2023-02-14 09:17:31.779 UTC [38] LOG: database system was not properly shut down; automatic recovery in progress 2023-02-14 09:17:31.781 UTC [38] LOG: redo starts at 0/15D7070 2023-02-14 09:17:31.782 UTC [38] LOG: invalid record length at 0/15D70A8: wanted 24, got 0 2023-02-14 09:17:31.782 UTC [38] LOG: redo done at 0/15D7070 2023-02-14 09:17:31.791 UTC [1] LOG: database system is ready to accept connections It looks like the OS is killing the process due to running OOM, which is not very surprising when looking at the query. Is this expected, or should PG have guards in place to prevent this from happening? I could reproduce it using a fresh docker container of the image "postgres:13.10". SELECT version(); > PostgreSQL 13.10 (Debian 13.10-1.pgdg110+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
pgsql-bugs by date: