BUG #16754: When using LLVM and parallel queries aborted all session by pg_cancel_backend. - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #16754: When using LLVM and parallel queries aborted all session by pg_cancel_backend. |
Date | |
Msg-id | 16754-4a800d382f5cfae9@postgresql.org Whole thread Raw |
Responses |
Re: BUG #16754: When using LLVM and parallel queries aborted all session by pg_cancel_backend.
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 16754 Logged by: Noriyoshi Shinoda Email address: noriyoshi.shinoda@hpe.com PostgreSQL version: 13.1 Operating system: Red Hat Enterprise Linux 7.8 (x86-64) Description: OS: Red Hat Enterprise Linux 7.8 (x86-64) Version: PostgreSQL 13.1 cfe-7.1.0 llvm-7.1.0 cmake-3.18.4 Configure: configure --prefix=$HOME/pg131 --with-llvm --enable-debug --enable-cassert $ pg_config BINDIR = /home/postgres/pg131/bin DOCDIR = /home/postgres/pg131/share/doc HTMLDIR = /home/postgres/pg131/share/doc INCLUDEDIR = /home/postgres/pg131/include PKGINCLUDEDIR = /home/postgres/pg131/include INCLUDEDIR-SERVER = /home/postgres/pg131/include/server LIBDIR = /home/postgres/pg131/lib PKGLIBDIR = /home/postgres/pg131/lib LOCALEDIR = /home/postgres/pg131/share/locale MANDIR = /home/postgres/pg131/share/man SHAREDIR = /home/postgres/pg131/share SYSCONFDIR = /home/postgres/pg131/etc PGXS = /home/postgres/pg131/lib/pgxs/src/makefiles/pgxs.mk CONFIGURE = '--prefix=/home/postgres/pg131' '--with-llvm' '--enable-debug' '--enable-cassert' CC = gcc -std=gnu99 CPPFLAGS = -D_GNU_SOURCE CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 CFLAGS_SL = -fPIC LDFLAGS = -L/usr/local/lib -Wl,--as-needed -Wl,-rpath,'/home/postgres/pg131/lib',--enable-new-dtags LDFLAGS_EX = LDFLAGS_SL = LIBS = -lpgcommon -lpgport -lpthread -lz -lreadline -lrt -ldl -lm VERSION = PostgreSQL 13.1 - Prepare $ initdb --no-locale --encoding=utf8 data.131 $ pg_ctl -D data.131 start $ psql postgres=# SELECT version(); version --------------------------------------------------------------------------------------------------------- PostgreSQL 13.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit (1 row) postgres=# CREATE USER demo PASSWORD 'demo'; CREATE ROLE postgres=# \connect postgres demo You are now connected to database "postgres" as user "demo". postgres=> CREATE TABLE data1(c1 NUMERIC, c2 VARCHAR(10)); CREATE TABLE postgres=> INSERT INTO data1 VALUES (generate_series(1, 1000000), 'data1'); INSERT 0 1000000 postgres=> ANALYZE data1; ANALYZE postgres=> \d+ List of relations Schema | Name | Type | Owner | Persistence | Size | Description --------+-------+-------+-------+-------------+-------+------------- public | data1 | table | demo | permanent | 42 MB | (1 row) - Operation -- Normal case (No JIT/Cancel parallel worker) --- Session#1 $ psql postgres demo postgres=> SET force_parallel_mode = on; SET postgres=> SET jit = off; SET postgres=> EXPLAIN SELECT * FROM data1 d1, data1 d2; QUERY PLAN ------------------------------------------------------------------------------------- Gather (cost=1000.00..117383034312.00 rows=1000000000000 width=24) Workers Planned: 1 Single Copy: true -> Nested Loop (cost=0.00..17383033312.00 rows=1000000000000 width=24) -> Seq Scan on data1 d1 (cost=0.00..15406.00 rows=1000000 width=12) -> Materialize (cost=0.00..25289.00 rows=1000000 width=12) -> Seq Scan on data1 d2 (cost=0.00..15406.00 rows=1000000 width=12) (7 rows) postgres=> EXPLAIN ANALYZE SELECT * FROM data1 d1, data1 d2; <- Execute pg_cancel_backend(130457) on Session#2 ERROR: canceling statement due to user request CONTEXT: parallel worker --- Session#2 $ psql postgres=# SELECT pid, leader_pid FROM pg_stat_activity WHERE usename='demo'; pid | leader_pid --------+------------ 130413 | 130457 | 130413 (2 rows) postgres=# SELECT pg_cancel_backend(130457); pg_cancel_backend ------------------- t (1 row) -- NG case (With JIT / Cancel parallel worker) --- Session#1 $ psql postgres demo postgres=> SET force_parallel_mode = on; SET postgres=> SET jit = on; SET postgres=> EXPLAIN SELECT * FROM data1 d1, data1 d2; QUERY PLAN ------------------------------------------------------------------------------------- Gather (cost=1000.00..117383034312.00 rows=1000000000000 width=24) Workers Planned: 1 Single Copy: true -> Nested Loop (cost=0.00..17383033312.00 rows=1000000000000 width=24) -> Seq Scan on data1 d1 (cost=0.00..15406.00 rows=1000000 width=12) -> Materialize (cost=0.00..25289.00 rows=1000000 width=12) -> Seq Scan on data1 d2 (cost=0.00..15406.00 rows=1000000 width=12) JIT: Functions: 3 Options: Inlining true, Optimization true, Expressions true, Deforming true (10 rows) postgres=> EXPLAIN ANALYZE SELECT * FROM data1 d1, data1 d2; <- Execute pg_cancel_backend(130342); WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. ERROR: canceling statement due to user request CONTEXT: parallel worker server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !?> --- Session#2 postgres=# SELECT pid, leader_pid FROM pg_stat_activity WHERE usename='demo'; pid | leader_pid --------+------------ 130322 | 130342 | 130322 (2 rows) postgres=# SELECT pg_cancel_backend(130342); pg_cancel_backend ------------------- t (1 row) - log file 2020-12-01 14:26:02.337 JST [130270] LOG: background worker "parallel worker" (PID 130342) was terminated by signal 11: Segmentation fault 2020-12-01 14:26:02.337 JST [130270] DETAIL: Failed process was running: EXPLAIN ANALYZE SELECT * FROM data1 d1, data1 d2; 2020-12-01 14:26:02.337 JST [130270] LOG: terminating any other active server processes 2020-12-01 14:26:02.337 JST [130322] WARNING: terminating connection because of crash of another server process 2020-12-01 14:26:02.337 JST [130322] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2020-12-01 14:26:02.337 JST [130322] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2020-12-01 14:26:02.338 JST [130276] WARNING: terminating connection because of crash of another server process 2020-12-01 14:26:02.338 JST [130276] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2020-12-01 14:26:02.338 JST [130276] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2020-12-01 14:26:02.340 JST [130345] FATAL: the database system is in recovery mode 2020-12-01 14:26:02.343 JST [130319] WARNING: terminating connection because of crash of another server process 2020-12-01 14:26:02.343 JST [130319] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2020-12-01 14:26:02.343 JST [130319] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2020-12-01 14:26:02.343 JST [130270] LOG: all server processes terminated; reinitializing 2020-12-01 14:26:02.355 JST [130348] LOG: database system was interrupted; last known up at 2020-12-01 14:21:07 JST 2020-12-01 14:26:02.360 JST [130348] LOG: database system was not properly shut down; automatic recovery in progress 2020-12-01 14:26:02.360 JST [130348] LOG: redo starts at 0/15C07D0 2020-12-01 14:26:02.767 JST [130348] LOG: invalid record length at 0/5B10A90: wanted 24, got 0 2020-12-01 14:26:02.767 JST [130348] LOG: redo done at 0/5B10A58 2020-12-01 14:26:02.821 JST [130270] LOG: database system is ready to accept connections
pgsql-bugs by date: