Thread: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
Piotr Włodarczyk
Date:
Hello everyone,

We're going to introduce in production environment solution based on table partitioning.

I think we found BUG in version from 11.4 to 11.5 (maybe and previous).

We have one big table, let's say "book". To improve manage and maintenance we've decided to partition the table in two partitions levels. First based on ranges of one column and level on another one. After release our changes on develop stage we've met unexpected behaviour: database randomly went into recovery mode. After a little investigation we found query that cause database crash. We've noticed before that in good practices in partitioning was point saying about limit for one hundred of partitions. We have +/- 400 so we thought it can be problem. For example with memory or something like that. We've tried to "explain" problematic query to show planner but explain fails too (without analyze). Finally we've discovered that problem occures likewise when we operate on empty tables with two ranged partitions and one default on one level (probably error occurs too on smaller count of partitions). Below reproduction path:

PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit.

Table create:
CREATE TABLE book (
    id SERIAL,
    id_publishing_house BIGINT,
    date_of_publish DATE
) PARTITION BY RANGE (id_publishing_house);

-- First level
CREATE TABLE book_default PARTITION OF book DEFAULT;
CREATE TABLE book_10000000 PARTITION OF book FOR VALUES FROM (1) TO (10000001);
CREATE TABLE book_20000000 PARTITION OF book FOR VALUES FROM (10000001) TO (20000001);

QUERY:
SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);

This query is crashing database. What is more interesting:

SELECT * FROM book WHERE id_publishing_house IN (NULL);

works fine.

Logs:

2019-08-09 18:02:29.560 CEST [13500] LOG:  database system was shut down at 2019-08-09 18:02:20 CEST
2019-08-09 18:02:29.591 CEST [18256] LOG:  database system is ready to accept connections
2019-08-09 18:02:53.566 CEST [18256] LOG:  server process (PID 2076) was terminated by exception 0xC0000005
2019-08-09 18:02:53.566 CEST [18256] DETAIL:  Failed process was running: SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);
2019-08-09 18:02:53.566 CEST [18256] HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
2019-08-09 18:02:53.566 CEST [18256] LOG:  terminating any other active server processes
2019-08-09 18:02:53.571 CEST [16400] WARNING:  terminating connection because of crash of another server process
2019-08-09 18:02:53.571 CEST [16400] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-08-09 18:02:53.571 CEST [16400] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-08-09 18:02:53.594 CEST [4980] WARNING:  terminating connection because of crash of another server process
2019-08-09 18:02:53.594 CEST [4980] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-08-09 18:02:53.594 CEST [4980] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-08-09 18:02:53.634 CEST [18256] LOG:  all server processes terminated; reinitializing
2019-08-09 18:02:53.756 CEST [14972] FATAL:  the database system is in recovery mode
2019-08-09 18:02:53.769 CEST [13816] LOG:  database system was interrupted; last known up at 2019-08-09 18:02:29 CEST
2019-08-09 18:02:53.778 CEST [15388] FATAL:  the database system is in recovery mode
2019-08-09 18:02:54.640 CEST [12560] FATAL:  the database system is in recovery mode
2019-08-09 18:02:54.904 CEST [18008] FATAL:  the database system is in recovery mode
2019-08-09 18:02:55.637 CEST [16120] FATAL:  the database system is in recovery mode
2019-08-09 18:02:55.965 CEST [13816] LOG:  database system was not properly shut down; automatic recovery in progress
2019-08-09 18:02:55.972 CEST [13816] LOG:  redo starts at 0/168E160
2019-08-09 18:02:55.973 CEST [13816] LOG:  redo done at 0/168E160
2019-08-09 18:02:56.037 CEST [18256] LOG:  database system is ready to accept connections


--

Pozdrawiam
Piotr Włodarczyk

Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
Tom Lane
Date:
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
> I think we found BUG in version from 11.4 to 11.5 (maybe and previous).

Yeah, looks like the ScalarArrayOpExpr path in
match_clause_to_partition_key forgot to consider the possibility
of a constant-null array.  Will fix, thanks for the report!

            regards, tom lane



Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
Piotr Włodarczyk
Date:
Thanks for replay Tom,

Can you tell me in which versions it'll be fixed? For me it's critical situation because we can have some potentially dangerous places in application. 

On Fri, Aug 9, 2019 at 6:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Piotr Włodarczyk <piotrwlodarczyk89@gmail.com> writes:
> I think we found BUG in version from 11.4 to 11.5 (maybe and previous).

Yeah, looks like the ScalarArrayOpExpr path in
match_clause_to_partition_key forgot to consider the possibility
of a constant-null array.  Will fix, thanks for the report!

                        regards, tom lane


--

Pozdrawiam
Piotr Włodarczyk

Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
Tom Lane
Date:
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
> Can you tell me in which versions it'll be fixed? For me it's critical
> situation because we can have some potentially dangerous places in
> application.

Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.

If you're sufficiently worried about it you could apply the patch locally.
It's pretty trivial:

https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec

            regards, tom lane



Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
Piotr Włodarczyk
Date:
Cool, thanks! 

pt., 9 sie 2019, 19:21 użytkownik Tom Lane <tgl@sss.pgh.pa.us> napisał:
Piotr Włodarczyk <piotrwlodarczyk89@gmail.com> writes:
> Can you tell me in which versions it'll be fixed? For me it's critical
> situation because we can have some potentially dangerous places in
> application.

Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.

If you're sufficiently worried about it you could apply the patch locally.
It's pretty trivial:

https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec

                        regards, tom lane

Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
Piotr Włodarczyk
Date:
Hi,

I've checked your commit and let me ensue it's fulfill all cases. For example what with query like that: select * from coercepart where a = any(null); ? there is no casting to array of specified type. I'm not master in C but looking in source I believe that "if (array->const snull)" is enough. But for better check you can add that query (and similar) to tests/regress



On Fri, Aug 9, 2019 at 7:23 PM Piotr Włodarczyk <piotrwlodarczyk89@gmail.com> wrote:
Cool, thanks! 

pt., 9 sie 2019, 19:21 użytkownik Tom Lane <tgl@sss.pgh.pa.us> napisał:
Piotr Włodarczyk <piotrwlodarczyk89@gmail.com> writes:
> Can you tell me in which versions it'll be fixed? For me it's critical
> situation because we can have some potentially dangerous places in
> application.

Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.

If you're sufficiently worried about it you could apply the patch locally.
It's pretty trivial:

https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec

                        regards, tom lane


--

Pozdrawiam
Piotr Włodarczyk

Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
Piotr Włodarczyk
Date:
Hi,

In V12 this bug exists too. Is your commit going to be part of V12? 

pt., 9 sie 2019, 19:44 użytkownik Piotr Włodarczyk <piotrwlodarczyk89@gmail.com> napisał:
Hi,

I've checked your commit and let me ensue it's fulfill all cases. For example what with query like that: select * from coercepart where a = any(null); ? there is no casting to array of specified type. I'm not master in C but looking in source I believe that "if (array->const snull)" is enough. But for better check you can add that query (and similar) to tests/regress



On Fri, Aug 9, 2019 at 7:23 PM Piotr Włodarczyk <piotrwlodarczyk89@gmail.com> wrote:
Cool, thanks! 

pt., 9 sie 2019, 19:21 użytkownik Tom Lane <tgl@sss.pgh.pa.us> napisał:
Piotr Włodarczyk <piotrwlodarczyk89@gmail.com> writes:
> Can you tell me in which versions it'll be fixed? For me it's critical
> situation because we can have some potentially dangerous places in
> application.

Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.

If you're sufficiently worried about it you could apply the patch locally.
It's pretty trivial:

https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec

                        regards, tom lane


--

Pozdrawiam
Piotr Włodarczyk

Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From
David Rowley
Date:
On Tue, 13 Aug 2019 at 22:51, Piotr Włodarczyk
<piotrwlodarczyk89@gmail.com> wrote:
> In V12 this bug exists too. Is your commit going to be part of V12?

The fix Tom pushed is in master, v12 and v11.  v12's commit is
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59d3789294cf6d42325e92486b053f1ee5934eb8

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services