Proposal: Automatic partition creation - Mailing list pgsql-hackers

From Anastasia Lubennikova
Subject Proposal: Automatic partition creation
Date
Msg-id 7fec3abb-c663-c0d2-8452-a46141be6d4a@postgrespro.ru
Whole thread Raw
Responses Re: Proposal: Automatic partition creation  (Justin Pryzby <pryzby@telsasoft.com>)
Re: Proposal: Automatic partition creation  (Robert Haas <robertmhaas@gmail.com>)
Re: Proposal: Automatic partition creation  (Fabien COELHO <coelho@cri.ensmp.fr>)
Re: Proposal: Automatic partition creation  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
List pgsql-hackers
The previous discussion of automatic partition creation [1] has 
addressed static and dynamic creation of partitions and ended up with 
several syntax proposals.
In this thread, I want to continue this work.

Attached is PoC for static partition creation. The patch core is quite 
straightforward. It adds one more transform clause to convert given 
partitioning specification into several CREATE TABLE statements.

The patch implements following syntax:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

and partition_bound_spec is:

MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL 
range_step FROM range_start TO range_end

For more examples check auto_partitions.sql in the patch.

TODO:

- CONFIGURATION is just an existing keyword, that I picked as a stub.
  Ideas on better wording are welcome.

- IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
I wonder, is it worth placing a stub for dynamic partitioning, or we can 
rather add these keywords later.

- HASH and LIST static partitioning works as expected.
Testing and feedback are welcome.

- RANGE partitioning is not really implemented in this patch.
Now it only accepts interval data type as 'interval' and respectively 
date types as range_start and range_end expressions.
Only one partition is created. I found it difficult to implement the 
generation of bounds using internal functions and data types.
Both existing solutions (pg_pathman and pg_partman) rely on SQL level 
routines [2].
I am going to implement this via SPI, which allow to simplify checks and 
calculations. Do you see any pitfalls in this approach?

- Partition naming. Now partition names for all methods look like 
$tablename_$partnum
Do we want more intelligence here? Now we have 
RunObjectPostCreateHook(), which allows to rename the table.
To make it more user-friendly, we can later implement pl/pgsql function 
that sets the callback, as it is done in pg_pathman set_init_callback() [3].

- Current design doesn't allow to create default partition 
automatically. Do we need this functionality?

- Do you see any restrictions for future extensibility (dynamic 
partitioning, init_callback, etc.) in the proposed design ?

I expect this to be a long discussion, so here is the wiki page [4] to 
fix important questions and final agreements.

[1] 
https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1907150711080.22273%40lancre
[2] 
https://github.com/postgrespro/pg_pathman/blob/dbcbd02e411e6acea6d97f572234746007979538/range.sql#L99
[3] https://github.com/postgrespro/pg_pathman#additional-parameters
[4] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

pgsql-hackers by date:

Previous
From: yuzuko
Date:
Subject: Re: Autovacuum on partitioned table (autoanalyze)
Next
From: Amit Kapila
Date:
Subject: Re: Resetting spilled txn statistics in pg_stat_replication