Re: partition by range or by list constraint check (was Re: literal vs dynamic partition constraint in plan execution) - Mailing list pgsql-general

From Tom Lane
Subject Re: partition by range or by list constraint check (was Re: literal vs dynamic partition constraint in plan execution)
Date
Msg-id 26944.1567438342@sss.pgh.pa.us
Whole thread Raw
In response to partition by range or by list constraint check (was Re: literal vsdynamic partition constraint in plan execution)  (Luca Ferrari <fluca1978@gmail.com>)
List pgsql-general
Luca Ferrari <fluca1978@gmail.com> writes:
> I've done a simple test case, and find out that probably the problem I
> got was due to the partition schema I'm using.
> I want a table to be partitioned by a timestamp field with a first
> level partition by year, and a second level by month. Therefore, I did
> a BY LIST partitioning,

> CREATE TABLE root( pk int generated always as identity, v int, ts
> timestamp default current_timestamp )
> PARTITION BY LIST( extract( year from ts ) );

No, that's not going to work, unless your queries all explicitly use
"extract( year from ts ) = something" in their WHERE clauses.  There
is nothing in the system that would derive a constraint like that
from a constraint that just mentions ts.

In your example, the partition routing logic was entirely ineffective
because of this, so you tried to make up for that by adding
CHECK constraints.  But pruning based on CHECK constraints is done
at planning time, so it can't do anything with run-time-mutable
values such as CURRENT_TIMESTAMP.

> CREATE TABLE root( pk int generated always as identity, v int, ts
> timestamp default current_timestamp )
> PARTITION BY RANGE( ts );

> CREATE TABLE y2018
> PARTITION OF root
> FOR VALUES FROM ('2018-01-01 00:00:00.000000')
> TO ('2018-12-31 23:59:59.000000');

This is mostly the right way to do it; you forgot that range bounds
use "low <= variable < high" logic.  So the correct way to declare
the partitions is like

regression=# CREATE TABLE y2018
regression-# PARTITION OF root
regression-# FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
CREATE TABLE
regression=# CREATE TABLE y2019
PARTITION OF root
FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
CREATE TABLE

That's easier and doesn't leave a gap between partitions.

This is already sufficient to allow partition routing based on
equality or simple inequality involving "ts", so you don't need
extra CHECK constraints.

> I still don't get why using a literal in the first case can lead to a
> "more correct" plan.

With a literal, plan-time pruning based on the CHECK constraints
was possible.

> And I'm curious to know if there's a way to force constraints in the
> list partitioning to make the planner really aware of tables that can
> be excluded.

No.  The short answer here is that your query WHERE clauses have to
be things that the planner or partition routing code can relate to
the partitioning rules.  In the case of LIST partitioning, that means
there had better be WHERE constraints on the values specified in
the LIST clause, not values that perhaps could be shown to be related
to those values given extensive knowledge about the behaviors of
certain functions.  By and large, the system doesn't have such
knowledge.

            regards, tom lane



pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Upgrade 96 -> 11
Next
From: stan
Date:
Subject: Posible off topic ? pgmodeler