BUG #16260: Repetitive accessing to partitioned table inside transaction causes server process crash - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #16260: Repetitive accessing to partitioned table inside transaction causes server process crash
Date
Msg-id 16260-ceaef56a5acbf449@postgresql.org
Whole thread Raw
Responses Re: BUG #16260: Repetitive accessing to partitioned table inside transaction causes server process crash  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16260
Logged by:          Michael
Email address:      mingjuwu0505@gmail.com
PostgreSQL version: 12.2
Operating system:   Mint Linux 18.3
Description:

Hello,

Accessing to a partitioned table with large partitions (20K) inside a
transaction can cause postgres to use large amount of memory (>30GB) and
ultimately leading to server process crash. The system that I am using has
48GB physical memory and 72GB swap.

Setup and stress testings steps are attached below. With default
configuration of "max_locks_per_transaction", it will show:

ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

After increasing "max_locks_per_transaction" to 6400, while stress() is
running, the postgres process will continue to consume more memory.
Ultimately, in pgAdmin 4, console will show "Connection to the server has
been lost." The following messages can be found in postgres log:

2020-02-16 17:01:35.449 CST [22652] LOG:  server process (PID 23149) was
terminated by signal 9: Killed
2020-02-16 17:01:35.449 CST [22652] DETAIL:  Failed process was running:
call stress();
2020-02-16 17:01:35.449 CST [22652] LOG:  terminating any other active
server processes
2020-02-16 17:01:35.458 CST [22659] WARNING:  terminating connection because
of crash of another server process
2020-02-16 17:01:35.458 CST [22659] DETAIL:  The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2020-02-16 17:01:35.458 CST [22659] HINT:  In a moment you should be able to
reconnect to the database and repeat your command.
2020-02-16 17:01:35.487 CST [22664] postgres@17ptt_test_db WARNING:
terminating connection because of crash of another server process
2020-02-16 17:01:35.487 CST [22664] postgres@17ptt_test_db DETAIL:  The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnormally and
possibly corrupted shared memory.
2020-02-16 17:01:35.487 CST [22664] postgres@17ptt_test_db HINT:  In a
moment you should be able to reconnect to the database and repeat your
command.
2020-02-16 17:01:35.491 CST [22652] LOG:  all server processes terminated;
reinitializing
2020-02-16 17:01:35.652 CST [4757] LOG:  database system was interrupted;
last known up at 2020-02-16 16:56:26 CST
2020-02-16 17:01:35.939 CST [4758] postgres@17ptt_test_db FATAL:  the
database system is in recovery mode
2020-02-16 17:01:35.940 CST [4759] postgres@17ptt_test_db FATAL:  the
database system is in recovery mode
2020-02-16 17:01:38.622 CST [4757] LOG:  database system was not properly
shut down; automatic recovery in progress
2020-02-16 17:01:38.630 CST [4757] LOG:  redo starts at 9/D5353028
2020-02-16 17:01:38.630 CST [4757] LOG:  invalid record length at
9/D5353110: wanted 24, got 0
2020-02-16 17:01:38.630 CST [4757] LOG:  redo done at 9/D53530D8
2020-02-16 17:01:38.684 CST [22652] LOG:  database system is ready to accept
connections

In the stress() procedure, replacing DELETE with UPDATE statement has same
effect.

Best Regards,
Michael


---------
-- SETUP
---------

CREATE TABLE test_table (
    id bigint NOT NULL,
    category bigint NOT NULL
) PARTITION BY LIST (category);

CREATE OR REPLACE PROCEDURE create_test_table_paritions()
AS $$
DECLARE
    counter INTEGER := 1; 
BEGIN
    WHILE counter <= 20000 LOOP
        EXECUTE 'CREATE TABLE test_table_' || counter || ' PARTITION OF test_table
FOR VALUES IN (' || counter || ')';
        counter := counter + 1;
    END LOOP;
END
$$
LANGUAGE plpgsql;

call create_test_table_paritions();

--------------
-- STRESS TEST
--------------

CREATE OR REPLACE PROCEDURE stress()
AS $$
DECLARE
    counter INTEGER := 1; 
BEGIN
    WHILE counter <= 20000 LOOP
        DELETE FROM test_table WHERE category = counter;
        raise notice '%', counter;
        counter := counter + 1;
    END LOOP;
END
$$
LANGUAGE plpgsql;

call stress();


pgsql-bugs by date:

Previous
From: Heath Lord
Date:
Subject: Re: BUG #16259: Cannot Use "pg_ctl start -l logfile" on Clean Installon Windows Server 2012/2016
Next
From: Tom Lane
Date:
Subject: Re: BUG #16259: Cannot Use "pg_ctl start -l logfile" on Clean Install on Windows Server 2012/2016