Recovery mode with partitioned tables and ANY(NULL) in WHERE clause - Mailing list pgsql-bugs

From Piotr Włodarczyk
Subject Recovery mode with partitioned tables and ANY(NULL) in WHERE clause
Date
Msg-id CAP-dhMr+vRpwizEYjUjsiZ1vwqpohTm+3Pbdt6Pr7FEgPq9R0Q@mail.gmail.com
Whole thread Raw
Responses Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hello everyone,

We're going to introduce in production environment solution based on table partitioning.

I think we found BUG in version from 11.4 to 11.5 (maybe and previous).

We have one big table, let's say "book". To improve manage and maintenance we've decided to partition the table in two partitions levels. First based on ranges of one column and level on another one. After release our changes on develop stage we've met unexpected behaviour: database randomly went into recovery mode. After a little investigation we found query that cause database crash. We've noticed before that in good practices in partitioning was point saying about limit for one hundred of partitions. We have +/- 400 so we thought it can be problem. For example with memory or something like that. We've tried to "explain" problematic query to show planner but explain fails too (without analyze). Finally we've discovered that problem occures likewise when we operate on empty tables with two ranged partitions and one default on one level (probably error occurs too on smaller count of partitions). Below reproduction path:

PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit.

Table create:
CREATE TABLE book (
    id SERIAL,
    id_publishing_house BIGINT,
    date_of_publish DATE
) PARTITION BY RANGE (id_publishing_house);

-- First level
CREATE TABLE book_default PARTITION OF book DEFAULT;
CREATE TABLE book_10000000 PARTITION OF book FOR VALUES FROM (1) TO (10000001);
CREATE TABLE book_20000000 PARTITION OF book FOR VALUES FROM (10000001) TO (20000001);

QUERY:
SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);

This query is crashing database. What is more interesting:

SELECT * FROM book WHERE id_publishing_house IN (NULL);

works fine.

Logs:

2019-08-09 18:02:29.560 CEST [13500] LOG:  database system was shut down at 2019-08-09 18:02:20 CEST
2019-08-09 18:02:29.591 CEST [18256] LOG:  database system is ready to accept connections
2019-08-09 18:02:53.566 CEST [18256] LOG:  server process (PID 2076) was terminated by exception 0xC0000005
2019-08-09 18:02:53.566 CEST [18256] DETAIL:  Failed process was running: SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);
2019-08-09 18:02:53.566 CEST [18256] HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
2019-08-09 18:02:53.566 CEST [18256] LOG:  terminating any other active server processes
2019-08-09 18:02:53.571 CEST [16400] WARNING:  terminating connection because of crash of another server process
2019-08-09 18:02:53.571 CEST [16400] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-08-09 18:02:53.571 CEST [16400] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-08-09 18:02:53.594 CEST [4980] WARNING:  terminating connection because of crash of another server process
2019-08-09 18:02:53.594 CEST [4980] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-08-09 18:02:53.594 CEST [4980] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-08-09 18:02:53.634 CEST [18256] LOG:  all server processes terminated; reinitializing
2019-08-09 18:02:53.756 CEST [14972] FATAL:  the database system is in recovery mode
2019-08-09 18:02:53.769 CEST [13816] LOG:  database system was interrupted; last known up at 2019-08-09 18:02:29 CEST
2019-08-09 18:02:53.778 CEST [15388] FATAL:  the database system is in recovery mode
2019-08-09 18:02:54.640 CEST [12560] FATAL:  the database system is in recovery mode
2019-08-09 18:02:54.904 CEST [18008] FATAL:  the database system is in recovery mode
2019-08-09 18:02:55.637 CEST [16120] FATAL:  the database system is in recovery mode
2019-08-09 18:02:55.965 CEST [13816] LOG:  database system was not properly shut down; automatic recovery in progress
2019-08-09 18:02:55.972 CEST [13816] LOG:  redo starts at 0/168E160
2019-08-09 18:02:55.973 CEST [13816] LOG:  redo done at 0/168E160
2019-08-09 18:02:56.037 CEST [18256] LOG:  database system is ready to accept connections


--

Pozdrawiam
Piotr Włodarczyk

pgsql-bugs by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: BUG #15940: json_populate_recordset fails with ERROR: record typehas not been registered
Next
From: Tom Lane
Date:
Subject: Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause