Thread: Proposal: Automatic partition creation

Proposal: Automatic partition creation

From
Anastasia Lubennikova
Date:
The previous discussion of automatic partition creation [1] has 
addressed static and dynamic creation of partitions and ended up with 
several syntax proposals.
In this thread, I want to continue this work.

Attached is PoC for static partition creation. The patch core is quite 
straightforward. It adds one more transform clause to convert given 
partitioning specification into several CREATE TABLE statements.

The patch implements following syntax:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

and partition_bound_spec is:

MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL 
range_step FROM range_start TO range_end

For more examples check auto_partitions.sql in the patch.

TODO:

- CONFIGURATION is just an existing keyword, that I picked as a stub.
  Ideas on better wording are welcome.

- IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
I wonder, is it worth placing a stub for dynamic partitioning, or we can 
rather add these keywords later.

- HASH and LIST static partitioning works as expected.
Testing and feedback are welcome.

- RANGE partitioning is not really implemented in this patch.
Now it only accepts interval data type as 'interval' and respectively 
date types as range_start and range_end expressions.
Only one partition is created. I found it difficult to implement the 
generation of bounds using internal functions and data types.
Both existing solutions (pg_pathman and pg_partman) rely on SQL level 
routines [2].
I am going to implement this via SPI, which allow to simplify checks and 
calculations. Do you see any pitfalls in this approach?

- Partition naming. Now partition names for all methods look like 
$tablename_$partnum
Do we want more intelligence here? Now we have 
RunObjectPostCreateHook(), which allows to rename the table.
To make it more user-friendly, we can later implement pl/pgsql function 
that sets the callback, as it is done in pg_pathman set_init_callback() [3].

- Current design doesn't allow to create default partition 
automatically. Do we need this functionality?

- Do you see any restrictions for future extensibility (dynamic 
partitioning, init_callback, etc.) in the proposed design ?

I expect this to be a long discussion, so here is the wiki page [4] to 
fix important questions and final agreements.

[1] 
https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1907150711080.22273%40lancre
[2] 
https://github.com/postgrespro/pg_pathman/blob/dbcbd02e411e6acea6d97f572234746007979538/range.sql#L99
[3] https://github.com/postgrespro/pg_pathman#additional-parameters
[4] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: Proposal: Automatic partition creation

From
Justin Pryzby
Date:
On Mon, Jul 06, 2020 at 01:45:52PM +0300, Anastasia Lubennikova wrote:
> The previous discussion of automatic partition creation [1] has addressed
> static and dynamic creation of partitions and ended up with several syntax
> proposals.
...
> where partition_auto_create_clause is
> 
> CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

> - IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
> I wonder, is it worth placing a stub for dynamic partitioning, or we can
> rather add these keywords later.

I understand by "deferred" you mean that the partition isn't created at the
time CREATE TABLE is run but rather deferred until needed by INSERT.

For deferred, range partitioned tables, I think maybe what you'd want to
specify (and store) is the INTERVAL.  If the table is partitioned by day, then
we'd date_trunc('day', time) and dynamically create that day.  But if it was
partitioned by month, we'd create the month.  I think you'd want to have an
ALTER command for that (we would use that to change tables between
daily/monthly based on their current size).  That should also support setting
the MODULUS of a HASH partitioned table, to allow changing the size of its
partitions (currently, the user would have to more or less recreate the table
and move all its data into different partitions, but that's not ideal).

I don't know if it's important for anyone, but it would be interesting to think
about supporting sub-partitioning: partitions which are themselvese partitioned.
Like something => something_YYYY => something_YYYY_MM => something_YYYY_MM_DD.
You'd need to specify how to partition each layer of the heirarchy.  In the
most general case, it could be different partition strategy.

If you have a callback function for partition renaming, I think you'd want to
pass it not just the current name of the partition, but also the "VALUES" used
in partition creation.  Like (2020-04-05)TO(2020-05-06).  Maybe instead, we'd
allow setting a "format" to use to construct the partition name.  Like
"child.foo_bar_%Y_%m_%d".  Ideally, the formats would be fixed-length
(zero-padded, etc), so failures with length can happen at "parse" time of the
statement and not at "run" time of the creation.  You'd still have to handle
the case that the name already exists but isn't a partition (or is a partition
by doesn't handle the incoming tuple for some reason).

Also, maybe your "configuration" syntax would allow specifying other values.
Maybe including a retention period (as an INTERVAL for RANGE tables).  That's
useful if you had a command to PRUNE the oldest partitions, like ALTER..PRUNE.

-- 
Justin



Re: Proposal: Automatic partition creation

From
Robert Haas
Date:
On Mon, Jul 6, 2020 at 6:46 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
> partition_auto_create_clause
>
> where partition_auto_create_clause is
>
> CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec
>
> and partition_bound_spec is:
>
> MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL
> range_step FROM range_start TO range_end

Might be good to compare this to what other databases support.

> - IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
> I wonder, is it worth placing a stub for dynamic partitioning, or we can
> rather add these keywords later.

I think we should not add any keywords we don't need immediately - and
should seek to minimize the number of new keywords that we need to
add, though compatibility with other implementations might be a good
reason for accepting some new ones.

> - HASH and LIST static partitioning works as expected.
> Testing and feedback are welcome.
>
> - RANGE partitioning is not really implemented in this patch.
> Now it only accepts interval data type as 'interval' and respectively
> date types as range_start and range_end expressions.
> Only one partition is created. I found it difficult to implement the
> generation of bounds using internal functions and data types.
> Both existing solutions (pg_pathman and pg_partman) rely on SQL level
> routines [2].
> I am going to implement this via SPI, which allow to simplify checks and
> calculations. Do you see any pitfalls in this approach?

I don't really see why we need SPI here. Why can't we just try to
evaluate the impression and see if we get a constant of the right
type, then use that?

I think the big problem here is identifying the operator to use. We
have no way of identifying the "plus" or "minus" operator associated
with a datatype; indeed, that constant doesn't exist. So either we (a)
limit this to a short list of data types and hard-code the operators
to be used (which is kind of sad given how extensible our type system
is) or we (b) invent some new mechanism for identifying the +/-
operators that should be used for a datatype, which was also proposed
in the context of some previous discussion of window framing options,
but which I don't think ever went anywhere (which is a lot of work) or
we (c) just look for operators called '+' and/or '-' by operator name
(which will probably make Tom throw up in his mouth a little).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Proposal: Automatic partition creation

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jul 6, 2020 at 6:46 AM Anastasia Lubennikova
> <a.lubennikova@postgrespro.ru> wrote:
>> I am going to implement this via SPI, which allow to simplify checks and
>> calculations. Do you see any pitfalls in this approach?

> I don't really see why we need SPI here.

I would vote against any core facility that is implemented via SPI
queries.  It is just too darn hard to control the semantics completely in
the face of fun stuff like varying search_path.  Look at what a mess the
queries generated by the RI triggers are --- and they only have a very
small set of behaviors to worry about.  I'm still only about 95% confident
they don't have security issues, too.

If you're using SPI to try to look up appropriate operators, I think
the chances of being vulnerable to security problems are 100%.

> I think the big problem here is identifying the operator to use. We
> have no way of identifying the "plus" or "minus" operator associated
> with a datatype; indeed, that constant doesn't exist.

We did indeed solve this in connection with window functions, cf
0a459cec9.  I may be misunderstanding what the problem is here,
but I think trying to reuse that infrastructure might help.

            regards, tom lane



Re: Proposal: Automatic partition creation

From
Robert Haas
Date:
On Mon, Jul 6, 2020 at 12:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We did indeed solve this in connection with window functions, cf
> 0a459cec9.  I may be misunderstanding what the problem is here,
> but I think trying to reuse that infrastructure might help.

Ah, nice. I didn't realize that we'd added that. But I'm not sure that
it helps here, because I think we need to compute the end of the
range, not just test whether something is in a range. Like, if someone
wants monthly range partitions starting on 2020-01-01, we need to be
able to figure out that the subsequent months start on 2020-02-01,
2020-03-01, 2020-04-01, etc. Is there a way to use in_range to achieve
that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Proposal: Automatic partition creation

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jul 6, 2020 at 12:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We did indeed solve this in connection with window functions, cf
>> 0a459cec9.  I may be misunderstanding what the problem is here,
>> but I think trying to reuse that infrastructure might help.

> Ah, nice. I didn't realize that we'd added that. But I'm not sure that
> it helps here, because I think we need to compute the end of the
> range, not just test whether something is in a range.

Yeah, I was thinking about that later, and I agree that the in_range
support function doesn't quite do the job.  But we could expand on the
principle, and register addition (and subtraction?) functions as btree
support functions under the same rules as for in_range functions.

The reason in_range isn't just addition is that we wanted it to be able
to give correct answers even in cases where addition would overflow.
That's still valid for that use-case, but it doesn't apply here.

So it'd be something like "btree support function 4, registered under
amproclefttype x and amprocrighttype y, must have the signature
    plus(x, y) returns x
and it gives results compatible with the opfamily's ordering of type x".
Similarly for subtraction if we think we need that.

I'm not sure if we need a formal notion of what "compatible results"
means, but it probably would be something like "if x < z according to the
opfamily sort ordering, then plus(x, y) < plus(z, y) for any given y".
Now this falls to the ground when y is a weird value like Inf or NaN,
but we'd want to exclude those as partitioning values anyway.  Do we
also need some datatype-independent way of identifying such "weird
values"?

            regards, tom lane



Re: Proposal: Automatic partition creation

From
Fabien COELHO
Date:
Hello Anastasia,

My 0.02 €:

> The patch implements following syntax:
>
> CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
> partition_auto_create_clause
>
> where partition_auto_create_clause is
>
> CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec
>
> and partition_bound_spec is:
>
> MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL range_step 
> FROM range_start TO range_end

ISTM That we should avoid new specific syntaxes when possible, and prefer 
free keyword option style, like it is being discussed for some other 
commands, because it reduces the impact on the parser.

That would suggest a more versatile partition_bound_spec which could look 
like  (<keyword> <constant-or-maybe-even-expr>[, …]):

For modulus, looks easy:

   (MODULUS 8)

For interval, maybe something like:

   (STEP ..., FROM/START ..., TO/END ...)

The key point is that for dynamic partitioning there would be no need for 
boundaries, so that it could just set a point and an interval

   (START/INIT/FROM??? ..., STEP ...)

For lists of values, probably it would make little sense to have dynamic 
partitioning? Or maybe yes, if we could partition on a column 
value/expression?! eg "MOD(id, 8)"??

What about pg_dump? Should it be able to regenerate the initial create?

> [4] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements

Good point, a wiki is better than a thread for that type of things. I'll 
look at this page.

-- 
Fabien.

Re: Proposal: Automatic partition creation

From
Amul Sul
Date:
On Wed, Jul 8, 2020 at 10:24 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
>
> Hello Anastasia,
>
> My 0.02 €:
>
> > The patch implements following syntax:
> >
> > CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
> > partition_auto_create_clause
> >
> > where partition_auto_create_clause is
> >
> > CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec
> >
> > and partition_bound_spec is:
> >
> > MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL range_step
> > FROM range_start TO range_end
>
> ISTM That we should avoid new specific syntaxes when possible, and prefer
> free keyword option style, like it is being discussed for some other
> commands, because it reduces the impact on the parser.
>
> That would suggest a more versatile partition_bound_spec which could look
> like  (<keyword> <constant-or-maybe-even-expr>[, …]):
>
> For modulus, looks easy:
>
>    (MODULUS 8)
>
> For interval, maybe something like:
>
>    (STEP ..., FROM/START ..., TO/END ...)
>
> The key point is that for dynamic partitioning there would be no need for
> boundaries, so that it could just set a point and an interval
>
>    (START/INIT/FROM??? ..., STEP ...)
>
> For lists of values, probably it would make little sense to have dynamic
> partitioning? Or maybe yes, if we could partition on a column
> value/expression?! eg "MOD(id, 8)"??
>
> What about pg_dump? Should it be able to regenerate the initial create?
>
I don't think this is needed for the proposed "Automatic partitioning (static)"
which generates a bunch of CREATE TABLE statements, IIUC.  Might be needed later
for "Automatic partitioning (dynamic)" where dynamic specifications need to be
stored.

> > [4] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements
>
> Good point, a wiki is better than a thread for that type of things. I'll
> look at this page.
+1

Regards,
Amul



Re: Proposal: Automatic partition creation

From
Anastasia Lubennikova
Date:
On 06.07.2020 19:10, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jul 6, 2020 at 6:46 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
I am going to implement this via SPI, which allow to simplify checks and
calculations. Do you see any pitfalls in this approach?
I don't really see why we need SPI here.
I would vote against any core facility that is implemented via SPI
queries.  It is just too darn hard to control the semantics completely in
the face of fun stuff like varying search_path.  Look at what a mess the
queries generated by the RI triggers are --- and they only have a very
small set of behaviors to worry about.  I'm still only about 95% confident
they don't have security issues, too.

If you're using SPI to try to look up appropriate operators, I think
the chances of being vulnerable to security problems are 100%.
Good to know, thank you for that. I had doubts about the internal usage of SPI,
but didn't know what exactly can go wrong.


I think the big problem here is identifying the operator to use. We
have no way of identifying the "plus" or "minus" operator associated
with a datatype; indeed, that constant doesn't exist.
We did indeed solve this in connection with window functions, cf
0a459cec9.  I may be misunderstanding what the problem is here,
but I think trying to reuse that infrastructure might help.

Do we need to introduce a new support function? Is there a reason why we can
not rely on '+' operator? I understand that the addition operator may lack or
be overloaded for some complex datatypes, but I haven't found any examples that
are useful for range partitioning. Both pg_pathman and pg_partman also use '+'
to generate bounds.

I explored the code a bit more and came up with this function, which is very
similar to generate_series_* functions, but it doesn't use SPI and looks for
the function that implements the '+' operator, instead of direct call:

// almost pseudocode

static Const *
generate_next_bound(Const *start, Const *interval)
{
    ObjectWithArgs *sum_oper_object = makeNode(ObjectWithArgs);

    sum_oper_object->type = OBJECT_OPERATOR;
    /* hardcode '+' operator for addition */
    sum_oper_object->objname = list_make1(makeString("+"));

    ltype = makeTypeNameFromOid(start->consttype, start->consttypmod);
    rtype = makeTypeNameFromOid(interval->consttype, interval->consttypmod);

    sum_oper_object->objargs = list_make2(ltype, rtype);

    sum_oper_oid = LookupOperWithArgs(sum_oper_object, false);
    oprcode = get_opcode(sum_oper_oid);
    fmgr_info(oprcode, &opproc);

    next_bound->constvalue = FunctionCall2(&opproc,
                             start->constvalue,
                             interval->constvalue);
}

Thoughts?

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Proposal: Automatic partition creation

From
Tom Lane
Date:
Anastasia Lubennikova <a.lubennikova@postgrespro.ru> writes:
> On 06.07.2020 19:10, Tom Lane wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I think the big problem here is identifying the operator to use. We
>>> have no way of identifying the "plus" or "minus" operator associated
>>> with a datatype; indeed, that constant doesn't exist.

>> We did indeed solve this in connection with window functions, cf
>> 0a459cec9.  I may be misunderstanding what the problem is here,
>> but I think trying to reuse that infrastructure might help.

> Do we need to introduce a new support function? Is there a reason why we
> can not rely on '+' operator?

(1) the appropriate operator might not be named '+'
(2) even if it is, it might not be in your search_path
(3) you're vulnerable to security problems from someone capturing the
    '+' operator with a better match; since you aren't writing the
    operator explicitly, you can't fix that by qualifying it
(4) if the interval constant is written as an undecorated string
    literal, the parser may have trouble resolving a match at all

> I understand that the addition operator may lack or be overloaded for
> some complex datatypes, but I haven't found any examples that are useful
> for range partitioning.

"It works for all the built-in data types" isn't really a satisfactory
answer.  But even just in the built-in types, consider "date":

# select oid::regoperator from pg_operator where oprname ='+' and oprleft = 'date'::regtype;
              oid
--------------------------------
 +(date,interval)
 +(date,integer)
 +(date,time without time zone)
 +(date,time with time zone)
(4 rows)

It's not that immediately obvious which of these would make sense to use.

But the short answer here is that we did not accept relying on '+' being
the right thing for window function ranges, and I don't see why it is more
acceptable for partitioning ranges.  The existing places where our parser
relies on implicit operator names are, without exception, problematic [1].

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/ffefc172-a487-aa87-a0e7-472bf29735c8%40gmail.com



Re: Proposal: Automatic partition creation

From
Anastasia Lubennikova
Date:
On 06.07.2020 13:45, Anastasia Lubennikova wrote:
The previous discussion of automatic partition creation [1] has addressed static and dynamic creation of partitions and ended up with several syntax proposals.
In this thread, I want to continue this work.

...
[1] https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1907150711080.22273%40lancre

Syntax proposal v2, that takes into account received feedback.

I compared the syntax of other databases. You can find an overview here [1]. It
seems that there is no industry standard, so every DBMS has its own
implementation. I decided to rely on a Greenplum syntax, as the most similar to
the original PostgreSQL syntax.

New proposal is:

CREATE TABLE numbers(int number)
PARTITION BY partition_method (list_of_columns)
USING (partition_desc)

where partition_desc is:

MODULUS n
| VALUES IN (value_list), [DEFAULT PARTITION part_name]
| START ([datatype] 'start_value')  END ([datatype] 'end_value')  EVERY (partition_step), [DEFAULT PARTITION part_name]

where partition_step is:
[datatype] [number | INTERVAL] 'interval_value' 
example:

CREATE TABLE years(int year)
PARTITION BY RANGE (year)
USING
(START (2006) END (2016) EVERY (1),
DEFAULT PARTITION other_years);

It is less wordy than the previous version. It uses a free keyword option
style. It covers static partitioning for all methods, default partition for
list and range methods, and can be extended to implement dynamic partitioning
for range partitions.

[1] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Other_DBMS
[2] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Proposal_.28is_subject_to_change.29

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Proposal: Automatic partition creation

From
Anastasia Lubennikova
Date:
On 06.07.2020 17:59, Justin Pryzby wrote:
I think you'd want to have an
ALTER command for that (we would use that to change tables between
daily/monthly based on their current size).  That should also support setting
the MODULUS of a HASH partitioned table, to allow changing the size of its
partitions (currently, the user would have to more or less recreate the table
and move all its data into different partitions, but that's not ideal).
New syntax fits to the ALTER command as well.
ALTER TABLE tbl
PARTITION BY HASH (number)
USING (partition_desc)
In simple cases (i.e. range partitioning granularity), it will simply update
the rule of bound generation, saved in the catalog. More complex hash
partitions will require some rebalancing. Though, the syntax is pretty
straightforward for all cases. In the next versions, we can also add a
CONCURRENTLY keyword to cover partitioning of an existing non-partitioned table
with data.
I don't know if it's important for anyone, but it would be interesting to think
about supporting sub-partitioning: partitions which are themselvese partitioned.
Like something => something_YYYY => something_YYYY_MM => something_YYYY_MM_DD.
You'd need to specify how to partition each layer of the heirarchy.  In the
most general case, it could be different partition strategy.

I suppose it will be a natural extension of this work. Now we need to ensure
that the proposed syntax is extensible. Greenplum syntax, which I choose as an
example, provides subpartition syntax as well.

If you have a callback function for partition renaming, I think you'd want to
pass it not just the current name of the partition, but also the "VALUES" used
in partition creation.  Like (2020-04-05)TO(2020-05-06).  Maybe instead, we'd
allow setting a "format" to use to construct the partition name.  Like
"child.foo_bar_%Y_%m_%d".  Ideally, the formats would be fixed-length
(zero-padded, etc), so failures with length can happen at "parse" time of the
statement and not at "run" time of the creation.  You'd still have to handle
the case that the name already exists but isn't a partition (or is a partition
by doesn't handle the incoming tuple for some reason).

In callback design, I want to use the best from pg_pathman's set_init_callback().
The function accepts jsonb argument, which contains all the data about the
parent table, bounds, and so on. This information can be used to construct name
for the partition and generate RENAME statement.

Also, maybe your "configuration" syntax would allow specifying other values.
Maybe including a retention period (as an INTERVAL for RANGE tables).  That's
useful if you had a command to PRUNE the oldest partitions, like ALTER..PRUNE.
In this version, I got rid of the 'configuration' keyword. Speaking of
retention, I think that it would be hard to cover all use-cases with a
declarative syntax. While it is relatively easy to implement deletion within a
callback function. See rotation_callback example in pg_pathman [1].

[1] https://github.com/postgrespro/pg_pathman/blob/79e11d94a147095f6e131e980033018c449f8e2e/sql/pathman_callbacks.sql#L107
-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

[PATCH] Automatic HASH and LIST partition creation

From
Anastasia Lubennikova
Date:
On 14.07.2020 00:11, Anastasia Lubennikova wrote:
On 06.07.2020 13:45, Anastasia Lubennikova wrote:
The previous discussion of automatic partition creation [1] has addressed static and dynamic creation of partitions and ended up with several syntax proposals.
In this thread, I want to continue this work.

...
[1] https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1907150711080.22273%40lancre

Syntax proposal v2, that takes into account received feedback.

CREATE TABLE numbers(int number)
PARTITION BY partition_method (list_of_columns)
USING (partition_desc)

where partition_desc is:

MODULUS n
| VALUES IN (value_list), [DEFAULT PARTITION part_name]
| START ([datatype] 'start_value')  END ([datatype] 'end_value')  EVERY (partition_step), [DEFAULT PARTITION part_name]

where partition_step is:
[datatype] [number | INTERVAL] 'interval_value'

It is less wordy than the previous version. It uses a free keyword option
style. It covers static partitioning for all methods, default partition for
list and range methods, and can be extended to implement dynamic partitioning
for range partitions.

[1] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Other_DBMS
[2] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Proposal_.28is_subject_to_change.29

Here is the patch for automated HASH and LIST partitioning, that implements proposed syntax.

Range partitioning is more complicated. It will require new support function to calculate bounds, new catalog attribute to store them and so on. So I want to start small and implement automated range partitioning in a separate patch later.

1) Syntax

New syntax is heavily based on Greenplum syntax for automated partitioning with one change. Keyword "USING", that was suggested above, causes shift/reduce conflict with "USING method" syntax of a table access method. It seems that Greenplum folks will face this problem later.

I stick to CONFIGURATION as an existing keyword that makes sense in this context.
Any better ideas are welcome.

Thus, current version is:

CREATE TABLE table_name (attrs)
PARTITION BY partition_method (list_of_columns)
CONFIGURATION (partition_desc)

where partition_desc is:

MODULUS n
| VALUES IN (value_list) [DEFAULT PARTITION part_name]

This syntax can be easily extended for range partitioning as well.

2) Implementation

PartitionBoundAutoSpec is a new part of PartitionSpec, that contains information needed to generate partition bounds.

For HASH and LIST automatic partition creation, transformation happens during parse analysis of CREATE TABLE statement.
transformPartitionAutoCreate() calculates bounds and generates statements to create partition tables.

Partitions are named in a format: $tablename_$partnum. One can use post create hook to rename relations.

For LIST partition one can also define a default partition.

3) TODO

The patch lacks documentation, because I expect some details may change during discussion. Other than that, the feature is ready for review.


Regards

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:

The patch lacks documentation, because I expect some details may change during discussion. Other than that, the feature is ready for review.

Hi, hackers!

From what I've read I see there is much interest in automatic partitions creation. (Overall discussion on the topic is partitioned into two threads: (1) https://www.postgresql.org/message-id/alpine.DEB.2.21.1907150711080.22273%40lancre and (2) https://www.postgresql.org/message-id/flat/7fec3abb-c663-c0d2-8452-a46141be6d4a@postgrespro.ru (current thread) ) 

There were many syntax proposals and finally, there is a patch realizing one of them. So I'd like to review it.

The syntax proposed in the patch seems good enough for me and is in accordance with one of the proposals in the discussion. Maybe I'd prefer using the word AUTOMATICALLY/AUTO instead of CONFIGURATION with explicit meaning that using this syntax we'd get already (automatically) created partitions and don't need to create them manually, as in the existing state of postgresql declarative partitioning. 

CREATE TABLE tbl (i int) PARTITION BY HASH (i) AUTOMATICALLY (MODULUS 3); (partitions are created automatically)
vs
CREATE TABLE tbl (i int) PARTITION BY HASH (i); (partitions should be created manually by use of PARTITION OF)

CREATE TABLE tbl (i char) PARTITION BY LIST (i) AUTOMATICALLY (VALUES IN ('a', 'b'), ('c', 'd'), ('e','f') DEFAULT PARTITION tbl_default);
vs
CREATE TABLE tbl (i char) PARTITION BY LIST (i); (partitions should be created manually by use of PARTITION OF)

I think this syntax can also be extended later with adding automatic creation of RANGE partitions, with IMMEDIATE/DEFERRED for dynamic/on-demand automatic partition creation, and with SUBPARTITION possibility.

But I don't have a strong preference for the word AUTOMATICALLY, moreover I saw opposition to using AUTO at the top of the discussion. I suppose we can go with the existing CONFIGURATION word.

If compare with existing declarative partitions, I think automatic creation simplifies the process for the end-user and  I'd vote for its committing into Postgres. The patch is short and clean in code style. It has enough comments Tests covering the new functionality are included. Yet it doesn't have documentation and I'd suppose it's worth adding it. Even if there will be syntax changes, I hope they will not be more than the replacement of several words. Current syntax is described in the text of a patch.

The patch applies cleanly and installcheck-world is passed. 

Some minor things:

I've got a compiler warning:
parse_utilcmd.c:4280:15: warning: unused variable 'lc' [-Wunused-variable]

When the number of partitions is over the maximum value of int32 the output shows a generic syntax error. I don't think it is very important as it is not the case someone will make deliberately, but maybe it's better to output something like "Partitions number is more than the maximum supported value"
create table test (i int, t text) partition by hash (i) configuration (modulus 888888888888);
ERROR:  syntax error at or near "888888888888" 

One more piece of nitpicking. Probably we can go just with a mention in documentation. 
create table test (i int, t text) partition by hash (i) configuration (modulus 8888);
ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

Typo:
+ /* Add statemets to create each partition after we create parent table */

Overall I see the patch almost ready for commit and I'd like to meet this functionality in v14.

Tested it and see this feature very cool and much simpler to use compared to declarative partitioning to date.

Thanks!
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Anastasia Lubennikova
Date:
On 08.09.2020 17:03, Pavel Borisov wrote:

The patch lacks documentation, because I expect some details may change during discussion. Other than that, the feature is ready for review.

Hi, hackers!

From what I've read I see there is much interest in automatic partitions creation. (Overall discussion on the topic is partitioned into two threads: (1) https://www.postgresql.org/message-id/alpine.DEB.2.21.1907150711080.22273%40lancre and (2) https://www.postgresql.org/message-id/flat/7fec3abb-c663-c0d2-8452-a46141be6d4a@postgrespro.ru (current thread) ) 

There were many syntax proposals and finally, there is a patch realizing one of them. So I'd like to review it.

The syntax proposed in the patch seems good enough for me and is in accordance with one of the proposals in the discussion. Maybe I'd prefer using the word AUTOMATICALLY/AUTO instead of CONFIGURATION with explicit meaning that using this syntax we'd get already (automatically) created partitions and don't need to create them manually, as in the existing state of postgresql declarative partitioning. 

CREATE TABLE tbl (i int) PARTITION BY HASH (i) AUTOMATICALLY (MODULUS 3); (partitions are created automatically)
vs
CREATE TABLE tbl (i int) PARTITION BY HASH (i); (partitions should be created manually by use of PARTITION OF)
 
CREATE TABLE tbl (i char) PARTITION BY LIST (i) AUTOMATICALLY (VALUES IN ('a', 'b'), ('c', 'd'), ('e','f') DEFAULT PARTITION tbl_default);
vs
CREATE TABLE tbl (i char) PARTITION BY LIST (i); (partitions should be created manually by use of PARTITION OF)

I think this syntax can also be extended later with adding automatic creation of RANGE partitions, with IMMEDIATE/DEFERRED for dynamic/on-demand automatic partition creation, and with SUBPARTITION possibility.

But I don't have a strong preference for the word AUTOMATICALLY, moreover I saw opposition to using AUTO at the top of the discussion. I suppose we can go with the existing CONFIGURATION word.

I agree that 'AUTOMATICALLY' keyword is more specific and probably less confusing for users. I've picked 'CONFIGURATION' simply because it is an already existing keyword. It would like to hear other opinions on that.


If compare with existing declarative partitions, I think automatic creation simplifies the process for the end-user and  I'd vote for its committing into Postgres. The patch is short and clean in code style. It has enough comments Tests covering the new functionality are included. Yet it doesn't have documentation and I'd suppose it's worth adding it. Even if there will be syntax changes, I hope they will not be more than the replacement of several words. Current syntax is described in the text of a patch.


Fair enough. New patch contains a documentation draft. While writing it, I also noticed, that the syntax, introduced in the patch differs from greenpulm one. For now, list partitioning clause doesn't support 'PARTITION name' part, that is supported in greenplum. I don't think that we aim for 100% compatibility here. Still, the ability to provide table names is probably a good optional feature, especially for list partitions.

What do you think?

The patch applies cleanly and installcheck-world is passed. 

Some minor things:

I've got a compiler warning:
parse_utilcmd.c:4280:15: warning: unused variable 'lc' [-Wunused-variable]

Fixed. This was also caught by cfbot. This version should pass it clean.


When the number of partitions is over the maximum value of int32 the output shows a generic syntax error. I don't think it is very important as it is not the case someone will make deliberately, but maybe it's better to output something like "Partitions number is more than the maximum supported value"
create table test (i int, t text) partition by hash (i) configuration (modulus 888888888888);
ERROR:  syntax error at or near "888888888888"

This value is not a valid int32 number, thus parser throws the error before we have a chance to handle it more gracefully.


One more piece of nitpicking. Probably we can go just with a mention in documentation. 
create table test (i int, t text) partition by hash (i) configuration (modulus 8888);
ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

Well, it looks like a legit error, when we try to lock a lot of objects in one transaction. I will double check if we don't release a lock somewhere.

Do we need to restrict the number of partitions, that can be created by this statement? With what number?  As far as I see, there is no such restriction for now, just a recommendation about performance issues. With automatic creation it becomes easier to mess with it.

Probably, it's enough to mention it in documentation and rely on users common sense.

Typo:
+ /* Add statemets to create each partition after we create parent table */

Fixed.

Overall I see the patch almost ready for commit and I'd like to meet this functionality in v14.
I also hope that this patch will make it to v14, but for now, I don't see a consensus on the syntax and some details, so I wouldn't rush.

Besides, it definitely needs more testing. I haven't thoroughly tested following cases yet:
- how triggers and constraints are propagated to partitions;
- how does it handle some tricky clauses in list partitioning expr_list;
and so on.

Also, there is an open question about partition naming. Currently, the patch implements dummy %tbl_%partnum name generation, which is far from user-friendly. I think we must provide some hook or trigger function to rename partitions after they were created.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
Michael Paquier
Date:
On Mon, Sep 14, 2020 at 02:38:56PM +0300, Anastasia Lubennikova wrote:
> Fixed. This was also caught by cfbot. This version should pass it clean.

Please note that regression tests are failing, because of 6b2c4e59.
--
Michael

Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
Anastasia Lubennikova
Date:
On 24.09.2020 06:27, Michael Paquier wrote:
> On Mon, Sep 14, 2020 at 02:38:56PM +0300, Anastasia Lubennikova wrote:
>> Fixed. This was also caught by cfbot. This version should pass it clean.
> Please note that regression tests are failing, because of 6b2c4e59.
> --
> Michael

Thank you. Updated patch is attached.

Open issues for review:
- new syntax;
- generation of partition names;
- overall patch review and testing, especially with complex partitioning 
clauses.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
Rahila Syed
Date:
Hi Anastasia,

I tested the syntax with some basic commands and it works fine, regression tests also pass.

Couple of comments: 
1. The syntax used omits the { IMMEDIATE | DEFERRED} keywords suggested in 
the earlier discussions. I think it is intuitive to include IMMEDIATE with the current implementation
so that the syntax can be extended with a  DEFERRED clause in future for dynamic partitions.
  CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
 CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);
 
2. One suggestion for generation of partition names is to append a unique id to
avoid conflicts.

3. Probably, here you mean to write list and hash instead of range and list as 
per the current state. 
     <para>
     Range and list partitioning also support automatic creation of partitions
      with an optional <literal>CONFIGURATION</literal> clause.
    </para>
 
4. Typo in default_part_name

+VALUES IN ( <replaceable class="parameter">partition_bound_expr</replaceable> [, ...] ), [( <replaceable class="parameter">partition_bound_expr</replaceable> [, ...] )] [, ...] [DEFAULT PARTITION <replaceable class="parameter">defailt_part_name</replaceable>]
+MODULUS <replaceable class="parameter">numeric_literal</replaceable>


Thank you,
Rahila Syed

Re: [PATCH] Automatic HASH and LIST partition creation

From
Anastasia Lubennikova
Date:
On 30.09.2020 22:58, Rahila Syed wrote:
Hi Anastasia,

I tested the syntax with some basic commands and it works fine, regression tests also pass.

Thank you for your review.
Couple of comments: 
1. The syntax used omits the { IMMEDIATE | DEFERRED} keywords suggested in 
the earlier discussions. I think it is intuitive to include IMMEDIATE with the current implementation
so that the syntax can be extended with a  DEFERRED clause in future for dynamic partitions.
  CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
 CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);
 
After some consideration, I decided that we don't actually need to introduce IMMEDIATE | DEFERRED keyword. For hash and list partitions it will always be immediate, as the number of partitions cannot change after we initially set it. For range partitions, on the contrary, it doesn't make much sense to make partitions immediately, because in many use-cases one bound will be open.

2. One suggestion for generation of partition names is to append a unique id to
avoid conflicts.

Can you please give an example of such a conflict? I agree that current naming scheme is far from perfect, but I think that 'tablename'_partnum provides unique name for each partition.


3. Probably, here you mean to write list and hash instead of range and list as 
per the current state. 
     <para>
     Range and list partitioning also support automatic creation of partitions
      with an optional <literal>CONFIGURATION</literal> clause.
    </para>
 
4. Typo in default_part_name

+VALUES IN ( <replaceable class="parameter">partition_bound_expr</replaceable> [, ...] ), [( <replaceable class="parameter">partition_bound_expr</replaceable> [, ...] )] [, ...] [DEFAULT PARTITION <replaceable class="parameter">defailt_part_name</replaceable>]
+MODULUS <replaceable class="parameter">numeric_literal</replaceable>

Yes, you're right. I will fix these typos in next version of the patch.

Thank you,
Rahila Syed


-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [PATCH] Automatic HASH and LIST partition creation

From
Rahila Syed
Date:

Hi, 

Couple of comments: 
1. The syntax used omits the { IMMEDIATE | DEFERRED} keywords suggested in 
the earlier discussions. I think it is intuitive to include IMMEDIATE with the current implementation
so that the syntax can be extended with a  DEFERRED clause in future for dynamic partitions.
  CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
 CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);
 
After some consideration, I decided that we don't actually need to introduce IMMEDIATE | DEFERRED keyword. For hash and list partitions it will always be immediate, as the number of partitions cannot change after we initially set it. For range partitions, on the contrary, it doesn't make much sense to make partitions immediately, because in many use-cases one bound will be open.


DEFERRED clause refers to creating partitions on the fly, while the data is being inserted. 
The number of partitions and partition bounds can be the same as specified initially 
during partitioned table creation, but the actual creation of partitions can be deferred. 
This seems like a potential extension to statically created partitions even in the case of 
hash and list partitions, as it won't involve moving any existing data.

     2. One suggestion for generation of partition names is to append a unique id to
avoid conflicts.

Can you please give an example of such a conflict? I agree that current naming scheme is far from perfect, but I think that 'tablename'_partnum provides unique name for each partition.


Sorry for not being clear earlier, I mean the partition name 'tablename_partnum' can conflict with any existing table name. 
As per current impemetation, if I do the following it results in the table name conflict.

postgres=# create table tbl_test_5_1(i int);
CREATE TABLE
postgres=# CREATE TABLE tbl_test_5 (i int) PARTITION BY LIST((tbl_test_5))                                                                                                               CONFIGURATION (values in ('(1)'::tbl_test_5), ('(3)'::tbl_test_5) default partition tbl_default_5);
ERROR:  relation "tbl_test_5_1" already exists 
 
Thank you,
Rahila Syed

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
Sorry for not being clear earlier, I mean the partition name 'tablename_partnum' can conflict with any existing table name. 
As per current impemetation, if I do the following it results in the table name conflict.

postgres=# create table tbl_test_5_1(i int);
CREATE TABLE
postgres=# CREATE TABLE tbl_test_5 (i int) PARTITION BY LIST((tbl_test_5))                                                                                                               CONFIGURATION (values in ('(1)'::tbl_test_5), ('(3)'::tbl_test_5) default partition tbl_default_5);
ERROR:  relation "tbl_test_5_1" already exists 

Basically, it's the same thing when you try to create two tables with the same name. It is not specific to partition creation and common for every case that using any defaults, they can conflict with something existing. And in this case this conflict is explicitly processes as I see from output message.

In fact in PG there are other places when names are done in default way e.g. in aggregates regression test it is not surprise to find in PG13:

explain (costs off)
  select min(f1), max(f1) from minmaxtest;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Result
   InitPlan 1 (returns $0)
     ->  Limit
           ->  Merge Append
                 Sort Key: minmaxtest.f1
                 ->  Index Only Scan using minmaxtesti on minmaxtest minmaxtest_1
                       Index Cond: (f1 IS NOT NULL)
                 ->  Index Only Scan using minmaxtest1i on minmaxtest1 minmaxtest_2
                       Index Cond: (f1 IS NOT NULL)
                 ->  Index Only Scan Backward using minmaxtest2i on minmaxtest2 minmaxtest_3
                       Index Cond: (f1 IS NOT NULL)
                 ->  Index Only Scan using minmaxtest3i on minmaxtest3 minmaxtest_4
   InitPlan 2 (returns $1)
     ->  Limit
           ->  Merge Append
                 Sort Key: minmaxtest_5.f1 DESC
                 ->  Index Only Scan Backward using minmaxtesti on minmaxtest minmaxtest_6
                       Index Cond: (f1 IS NOT NULL)
                 ->  Index Only Scan Backward using minmaxtest1i on minmaxtest1 minmaxtest_7
                       Index Cond: (f1 IS NOT NULL)
                 ->  Index Only Scan using minmaxtest2i on minmaxtest2 minmaxtest_8
                       Index Cond: (f1 IS NOT NULL)
                 ->  Index Only Scan Backward using minmaxtest3i on minmaxtest3 minmaxtest_9 

where minmaxtest_<number> are the temporary relations and minmaxtest<number> are real partition names (last naming is unrelated to first)

Overall I don't see much trouble in any form of automatic naming. But there may be a convenience to provide fixed user-specified prefix to partition names.

Thank you,
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
Hi, hackers!
I added some extra tests for different cases of use of automatic partition creation.
v3-0002 can be applied on top of the original v2 patch for correct work with some corner cases with constraints included in this test.

As for immediate/deferred I think that now only available now is immediate, so using word IMMEDIATE seems a little bit redundant to me. We may introduce this word together with adding DEFERRED option. However, my opinion is not in strong opposition to both options. Оther opinions are very much welcome!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
Anastasia Lubennikova
Date:
On 06.10.2020 00:21, Pavel Borisov wrote:
> Hi, hackers!
> I added some extra tests for different cases of use of automatic 
> partition creation.
> v3-0002 can be applied on top of the original v2 patch for correct 
> work with some corner cases with constraints included in this test.
>
Thank you for the tests. I've added them and the fix into the patch.

I also noticed, that some table parameters, such as persistence were not 
promoted to auto generated partitions. This is fixed now. The test cases 
for temp and unlogged auto partitioned tables are updated respectively.
Besides, I slightly refactored the code and fixed documentation typos, 
that were reported by Rahila.

With my recent changes, one test statement, that you've added as 
failing, works.

CREATE TABLE list_parted_fail (a int) PARTITION BY LIST (a) CONFIGURATION
(VALUES IN ('1' collate "POSIX"));

It simply ignores collate POSIX part and creates a table with following 
structure:


                        Partitioned table "public.list_parted_fail"
  Column |  Type   | Collation | Nullable | Default | Storage | Stats 
target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
  a      | integer |           |          |         | plain 
|              |
Partition key: LIST (a)
Partitions: list_parted_fail_0 FOR VALUES IN (1)

Do you think that it is a bug? For now, I removed this statement from 
tests just to calm down the CI.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
Do you think that it is a bug? For now, I removed this statement from
tests just to calm down the CI. 

It is in accordance with changes in tests for vanilla decralarive partitioning as per

commit 2dfa3fea88bc951d0812a18649d801f07964c9b9
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Mon Sep 28 13:44:01 2020 -0400
    Remove complaints about COLLATE clauses in partition bound values. 

which my test does for automatic way in the same style. So I consider your removal completely correct.

Thank you!

Re: [PATCH] Automatic HASH and LIST partition creation

From
Anastasia Lubennikova
Date:
On 05.10.2020 09:36, Rahila Syed wrote:

Hi, 

Couple of comments: 
1. The syntax used omits the { IMMEDIATE | DEFERRED} keywords suggested in 
the earlier discussions. I think it is intuitive to include IMMEDIATE with the current implementation
so that the syntax can be extended with a  DEFERRED clause in future for dynamic partitions.
  CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
 CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);
 
After some consideration, I decided that we don't actually need to introduce IMMEDIATE | DEFERRED keyword. For hash and list partitions it will always be immediate, as the number of partitions cannot change after we initially set it. For range partitions, on the contrary, it doesn't make much sense to make partitions immediately, because in many use-cases one bound will be open.


DEFERRED clause refers to creating partitions on the fly, while the data is being inserted. 
The number of partitions and partition bounds can be the same as specified initially 
during partitioned table creation, but the actual creation of partitions can be deferred. 
This seems like a potential extension to statically created partitions even in the case of 
hash and list partitions, as it won't involve moving any existing data.

Oh, now I see what you mean. The case with already existing tables will require changes to ALTER TABLE syntax. And that's where we may want to choose between immediate (i.e. locking) and deferred (i.e. concurrent) creation of partitions. I think we should try to implement it with existing keywords, maybe use 'CONCURRENTLY' keyword and it will look like:

ALTER TABLE tbl PARTITION BY ... CONFIGURATION (....) [CONCURRENTLY];

Anyway, the task of handling existing data is much more complicated, especially the 'concurrent' case and to be honest, I haven't put much thought into it yet.

The current patch only implements the simplest case of creating a new partitioned table. And I don't see if CREATE TABLE needs this immediate|deferred clause or if it will need it in the future.

Thoughts?


     2. One suggestion for generation of partition names is to append a unique id to
avoid conflicts.

Can you please give an example of such a conflict? I agree that current naming scheme is far from perfect, but I think that 'tablename'_partnum provides unique name for each partition.


Sorry for not being clear earlier, I mean the partition name 'tablename_partnum' can conflict with any existing table name. 
As per current impemetation, if I do the following it results in the table name conflict.

postgres=# create table tbl_test_5_1(i int);
CREATE TABLE
postgres=# CREATE TABLE tbl_test_5 (i int) PARTITION BY LIST((tbl_test_5))                                                                                                               CONFIGURATION (values in ('(1)'::tbl_test_5), ('(3)'::tbl_test_5) default partition tbl_default_5);
ERROR:  relation "tbl_test_5_1" already exists 
 


I don't mind adding some specific suffix for generated partitions, although it still may conflict with existing table names. The main disadvantage of this idea, is that it reduces number of symbols available for table name, which can lead to something like this:

CREATE TABLE parteddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd (a text, b int NOT NULL DEFAULT 0,  CONSTRAINT check_aa CHECK (length(a) > 0))
PARTITION BY LIST (a) CONFIGURATION (VALUES IN ('a','b'),('c','d') DEFAULT PARTITION parted_def) ;;
NOTICE:  identifier "parteddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd" will be truncated to "partedddddddddddddddddddddddddddddddddddddddddddddddddddddddddd"
ERROR:  relation "partedddddddddddddddddddddddddddddddddddddddddddddddddddddddddd" already exists

The error message here is a bit confusing, as relation 'partedddddddddddddddddddddddddddddddddddddddddddddddddddddddddd' haven't existed before and this is a conflict between partitioned and generated partition table name. For now, I don't know if we can handle it more gracefully. Probably, we could truncate tablename to a shorter size, but it doesn't provide a complete solution, because partition number can contain several digits.

See also pg_partman documentation on the same issue: https://github.com/pgpartman/pg_partman/blob/master/doc/pg_partman.md#naming-length-limits

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:

     2. One suggestion for generation of partition names is to append a unique id to
avoid conflicts.

Can you please give an example of such a conflict? I agree that current naming scheme is far from perfect, but I think that 'tablename'_partnum provides unique name for each partition.


Sorry for not being clear earlier, I mean the partition name 'tablename_partnum' can conflict with any existing table name. 
As per current impemetation, if I do the following it results in the table name conflict.

postgres=# create table tbl_test_5_1(i int);
CREATE TABLE
postgres=# CREATE TABLE tbl_test_5 (i int) PARTITION BY LIST((tbl_test_5))                                                                                                               CONFIGURATION (values in ('(1)'::tbl_test_5), ('(3)'::tbl_test_5) default partition tbl_default_5);
ERROR:  relation "tbl_test_5_1" already exists 
 


I don't mind adding some specific suffix for generated partitions, although it still may conflict with existing table names. The main disadvantage of this idea, is that it reduces number of symbols available for table name, which can lead to something like this:

CREATE TABLE parteddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd (a text, b int NOT NULL DEFAULT 0,  CONSTRAINT check_aa CHECK (length(a) > 0))
PARTITION BY LIST (a) CONFIGURATION (VALUES IN ('a','b'),('c','d') DEFAULT PARTITION parted_def) ;;
NOTICE:  identifier "parteddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd" will be truncated to "partedddddddddddddddddddddddddddddddddddddddddddddddddddddddddd"
ERROR:  relation "partedddddddddddddddddddddddddddddddddddddddddddddddddddddddddd" already exists
doc/pg_partman.md#naming-length-limits

It seems to me that a working idea is to add a prefix to partitions is to give the possibility to specify it for users. So the user will be able to choose appropriate and not very long suffix to avoid conflicts.
Maybe like this:
CREATE TABLE city (a text) PARTITION BY LIST (a) CONFIGURATION (VALUES IN ('a','b'),('c','d') DEFAULT PARTITION city_other PREFIX _prt) ;

Result:
---
city_prt1
city_prt2 
...
city_other

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
Again I've checked v3 patch. In the discussion, there are several other ideas on its further development, so I consider the patch as the first step to later progress. Though now the patch is fully self-sufficient in functionality and has enough tests etc. I suppose it is ready to be committed. 

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
I've realized one strange effect in current grammar parsing: if I do

CREATE TABLE foo (a int) PARTITION BY LIST (a) CONFIGURATION (a 1);
ERROR:  unrecognized auto partition bound specification "a"

I consulted the patch code and realized that in fact, the patch considers it the (invalid) HASH bounds (doesn't find a word 'modulus') unless it is specified to be (still invalid) LIST. This is due to the fact that the grammar parser is not context-aware and in the patch, we tried to avoid the new parser keyword MODULUS. The effect is that inside a CONFIGURATION parentheses in case of HASH bounds we don't have a single keyword for the parser to determine it is really a HASH case.

It doesn't make the patch work wrongly, besides it checks the validity of all types of bounds in the HASH case even when the partitioning is not HASH. I find this slightly bogus. This is because the parser can not determine the type of partitioning inside the configuration clause and this makes adding new syntax (e.g. adding RANGE partitioning configuration inside CONFIGURATION parentheses) complicated.

So I have one more syntax proposal: to have separate keywords inside CONFIGURATION parentheses for each partitioning type.
E.g:
CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES IN (1,2),(3,4) DEFAULT PARTITION foo_def);
CREATE TABLE foo(a int) PARTITION BY HASH(a) CONFIGURATION (FOR VALUES WITH MODULUS 3);
CREATE TABLE foo(a int) PARTITION BY RAGE(a) CONFIGURATION (FOR VALUES FROM 1 TO 1000 INTERVAL 10 DEFAULT PARTITION foo_def);

This proposal is in accordance with the current syntax of declarative partitioning: CREATE TABLE foo_1 PARTITION OF foo FOR VALUES ...  

Some more facultative proposals incremental to the abovementioned:
1. Omit CONFIGURATION with/without parentheses. This makes syntax closer to (non-automatic) declarative partitioning syntax but the clause seems less legible (in my opinion). 
2. Omit just FOR VALUES. This makes the clause short, but adds a difference to (non-automatic) declarative partitioning syntax.

 I'm planning also to add RANGE partitioning syntax to this in the future and I will be happy if all three types of the syntax could come along easily.
 I very much appreciate your views on this especially regarding that changes can be still made easily because the patch is not committed yet.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Maxim Orlov
Date:
On 2020-12-18 21:54, Pavel Borisov wrote:
> I've realized one strange effect in current grammar parsing: if I do
> 
> CREATE TABLE foo (a int) PARTITION BY LIST (a) CONFIGURATION (a 1);
> ERROR:  unrecognized auto partition bound specification "a"
> 
> I consulted the patch code and realized that in fact, the patch
> considers it the (invalid) HASH bounds (doesn't find a word 'modulus')
> unless it is specified to be (still invalid) LIST. This is due to the
> fact that the grammar parser is not context-aware and in the patch, we
> tried to avoid the new parser keyword MODULUS. The effect is that
> inside a CONFIGURATION parentheses in case of HASH bounds we don't
> have a single keyword for the parser to determine it is really a HASH
> case.
> 
> It doesn't make the patch work wrongly, besides it checks the validity
> of all types of bounds in the HASH case even when the partitioning is
> not HASH. I find this slightly bogus. This is because the parser can
> not determine the type of partitioning inside the configuration clause
> and this makes adding new syntax (e.g. adding RANGE partitioning
> configuration inside CONFIGURATION parentheses) complicated.
> 
> So I have one more syntax proposal: to have separate keywords inside
> CONFIGURATION parentheses for each partitioning type.
> E.g:
> CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES
> IN (1,2),(3,4) DEFAULT PARTITION foo_def);
> CREATE TABLE foo(a int) PARTITION BY HASH(a) CONFIGURATION (FOR VALUES
> WITH MODULUS 3);
> CREATE TABLE foo(a int) PARTITION BY RAGE(a) CONFIGURATION (FOR VALUES
> FROM 1 TO 1000 INTERVAL 10 DEFAULT PARTITION foo_def);
> 
> This proposal is in accordance with the current syntax of declarative
> partitioning: CREATE TABLE foo_1 PARTITION OF foo FOR VALUES ...
> 
> Some more facultative proposals incremental to the abovementioned:
> 1. Omit CONFIGURATION with/without parentheses. This makes syntax
> closer to (non-automatic) declarative partitioning syntax but the
> clause seems less legible (in my opinion).
> 2. Omit just FOR VALUES. This makes the clause short, but adds a
> difference to (non-automatic) declarative partitioning syntax.
> 
>  I'm planning also to add RANGE partitioning syntax to this in the
> future and I will be happy if all three types of the syntax could come
> along easily.
> 
>  I very much appreciate your views on this especially regarding that
> changes can be still made easily because the patch is not committed
> yet.
> 
> --
> 
> Best regards,
> Pavel Borisov
> 
> Postgres Professional: http://postgrespro.com [1]
> 
> 
> Links:
> ------
> [1] http://www.postgrespro.com

In my view, next expressions are the golden ground here. On one hand, 
not far from the original
non-automatic declarative partitioning syntax syntax, on the other hand, 
omit CONFIGURATION
key-word (which is redundant here in terms of gram parsing) makes this 
expressions less apprehensible for the human.

CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES 
IN (1,2),(3,4) DEFAULT PARTITION foo_def);
CREATE TABLE foo(a int) PARTITION BY HASH(a) CONFIGURATION (FOR VALUES 
WITH MODULUS 3);
CREATE TABLE foo(a int) PARTITION BY RAGE(a) CONFIGURATION (FOR VALUES 
FROM 1 TO 1000 INTERVAL 10 DEFAULT PARTITION foo_def);

In addition to that, adding RANGE PARTITION would be much simpler since 
we would have specific "branches" in gram instead of using 
context-sensitive grammar and dealing with it in c-code.
---
Best regards,
Maxim Orlov.



Re: [PATCH] Automatic HASH and LIST partition creation

From
Fabien COELHO
Date:
> CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES IN 
> (1,2),(3,4) DEFAULT PARTITION foo_def);

I would like to disagree with this syntactic approach because it would 
very specific to each partition method. IMHO the syntax should be as 
generic as possible. I'd suggest (probably again) a keyword/value list 
which would allow to be quite adaptable without inducing any pressure on 
the parser.

-- 
Fabien.



Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
> CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES IN
> (1,2),(3,4) DEFAULT PARTITION foo_def);

I would like to disagree with this syntactic approach because it would
very specific to each partition method. IMHO the syntax should be as
generic as possible. I'd suggest (probably again) a keyword/value list
which would allow to be quite adaptable without inducing any pressure on
the parser.
If I remember your proposal correctly it is something like
CREATE TABLE foo(...) PARTITION BY HASH AUTOMATIC (MODULUS 10);

It is still possible but there are some caveats:
1. We'll need to add keyword MODULUS (and probably AUTOMATIC) to the parser's list. I don't against this but as far as I've heard there is some opposition among PG community against new keywords. Maybe I am wrong.
2. The existing syntax for declarative partitioning is different to your proposal. It is still not a big problem and your proposal makes query shorter for several words. I'd just like to see some consensus on the syntax. Now I must admit there are too many contradictions in opinions which make progress slow. Also I think it is important to have a really convenient syntaх.
2a Maybe we all who participated in the thread can vote for some variant?
2b Maybe the existing syntax for declarative partitioniong should be given some priority as it is already committed into CREATE TABLE ... PARTITION OF ... FOR VALUES IN.. etc.

I'd be happy if everyone will join some version of the proposed syntaх in this thread and in the previous discussion [1]. If we have a variant with more than one supporter, sure we can develop patch based on it.
Thank you very much
and Merry Christmas!


-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Fabien COELHO
Date:
HEllo.

>>> CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES
>> IN
>>> (1,2),(3,4) DEFAULT PARTITION foo_def);
>>
>> I would like to disagree with this syntactic approach because it would
>> very specific to each partition method. IMHO the syntax should be as
>> generic as possible. I'd suggest (probably again) a keyword/value list
>> which would allow to be quite adaptable without inducing any pressure on
>> the parser.
>>
> If I remember your proposal correctly it is something like
> CREATE TABLE foo(...) PARTITION BY HASH AUTOMATIC (MODULUS 10);

Yep, that would be the spirit.

> It is still possible but there are some caveats: 1. We'll need to add 
> keyword MODULUS (and probably AUTOMATIC) to the parser's list.

Why? We could accept anything in the list? i.e.:

    (ident =? value[, ident =? value]*)

> I don't against this but as far as I've heard there is some
> opposition among PG community against new keywords. Maybe I am wrong.

the ident is a keyword that can be interpreted later on, not a "reserved 
keyword" from a parser perspective, which is the only real issue?

The parser does not need to know about it, only the command interpreter 
which will have to interpret it. AUTOMATIC is a nice parser cue to 
introduce such a ident-value list.

> 2. The existing syntax for declarative partitioning is different to your
> proposal.

Yep. I think that it was not so good a design choice from a 
language/extensibility perspective.

> It is still not a big problem and your proposal makes query
> shorter for several words. I'd just like to see some consensus on the
> syntax. Now I must admit there are too many contradictions in opinions
> which make progress slow. Also I think it is important to have a really
> convenient syntaх.



> 2a Maybe we all who participated in the thread can vote for some variant?
> 2b Maybe the existing syntax for declarative partitioniong should be given
> some priority as it is already committed into CREATE TABLE ... PARTITION OF
> ... FOR VALUES IN.. etc.

> I'd be happy if everyone will join some version of the proposed syntaх in
> this thread and in the previous discussion [1]. If we have a variant with
> more than one supporter, sure we can develop patch based on it.
> Thank you very much
> and Merry Christmas!
>
> [1]
> https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1907150711080.22273%40lancre
>
>

-- 
Fabien.

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:


Why? We could accept anything in the list? i.e.:

    (ident =? value[, ident =? value]*)

> I don't against this but as far as I've heard there is some
> opposition among PG community against new keywords. Maybe I am wrong.

the ident is a keyword that can be interpreted later on, not a "reserved
keyword" from a parser perspective, which is the only real issue?

The parser does not need to know about it, only the command interpreter
which will have to interpret it. AUTOMATIC is a nice parser cue to
introduce such a ident-value list.

> 2. The existing syntax for declarative partitioning is different to your
> proposal.

Yep. I think that it was not so good a design choice from a
language/extensibility perspective.
Thank you very much, Fabien. It is clear enough.
BTW could you tell me a couple of words about pros and cons of c-code syntax parsing comparing to parsing using gram.y trees? I think both are possible but my predisposition was that we'd better use the later if possible.

Best regards,
Pavel Borisov

Re: [PATCH] Automatic HASH and LIST partition creation

From
Fabien COELHO
Date:
> BTW could you tell me a couple of words about pros and cons of c-code
> syntax parsing comparing to parsing using gram.y trees?

I'd rather use an automatic tool (lexer/parser) if possible instead of 
doing it by hand if I can. If you want a really nice syntax with clever 
tricks, then you may need to switch to manual though, but pg/sql is not in 
that class.

> I think both are possible but my predisposition was that we'd better use 
> the later if possible.

I agree.

-- 
Fabien.



Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:

> BTW could you tell me a couple of words about pros and cons of c-code
> syntax parsing comparing to parsing using gram.y trees?

I'd rather use an automatic tool (lexer/parser) if possible instead of
doing it by hand if I can. If you want a really nice syntax with clever
tricks, then you may need to switch to manual though, but pg/sql is not in
that class.

> I think both are possible but my predisposition was that we'd better use
> the later if possible.

I agree.
Thank you!

Fabien, do you consider it possible to change the syntax of declarative partitioning too? It is problematic as it is already committed but also is very tempting to have the same type of syntax both in automatic partitioning and in manual (PARTITION OF...)

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Fabien COELHO
Date:
> Fabien, do you consider it possible to change the syntax of declarative
> partitioning too?

My 0.02 €: What I think does not matter much, what committers think is the 
way to pass something. However, I do not think that such an idea would 
pass a committer:-)

> It is problematic as it is already committed but also is very tempting 
> to have the same type of syntax both in automatic partitioning and in 
> manual (PARTITION OF...)

I think that if a "common" syntax, for a given meaning of common, can be 
thought of, and without breaking backward compatibility, then there may be 
an argument to provide such a syntax, but I would not put too much energy 
into that if I were you.

I see 3 cases:

  - partition declaration but no actual table generated, the current
    version.

  - partition declaration with actual sub-tables generated, eg for hash
    where it is pretty straightforward to know what would be needed, or for
    a bounded range.

  - partition declaration without generated table, but they are generated
    on demand, when needed, for a range one may want weekly or monthly
    without creating tables in advance, esp. if it is unbounded.

ISTM that the syntax should be clear and if possible homogeneous for all 
three use cases, even if they are not implemented yet. It should also 
allow easy extensibility, hence something without a strong syntax, 
key/value pairs to be interpreted later.

-- 
Fabien.

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
My 0.02 €: What I think does not matter much, what committers think is the
way to pass something. However, I do not think that such an idea would
pass a committer:-)

The same idea was the reason for my proposal to make automatic partitioning clauses to be in accordance with existing declarative syntax (even if it seems little bit long to write words "configuration (for values" )
 
CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES
IN (1,2),(3,4) DEFAULT PARTITION foo_def);
CREATE TABLE foo(a int) PARTITION BY HASH(a) CONFIGURATION (FOR VALUES
WITH MODULUS 3);
CREATE TABLE foo(a int) PARTITION BY RAGE(a) CONFIGURATION (FOR VALUES
FROM 1 TO 1000 INTERVAL 10 DEFAULT PARTITION foo_def)

If we want generic (ident = value,...) then we need to introduce different to what is already committed for manual partitioning which I considered worse than my proposal above. Still other opinions are highly valued.
-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Thomas Munro
Date:
On Wed, Oct 7, 2020 at 6:26 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> Do you think that it is a bug? For now, I removed this statement from
> tests just to calm down the CI.

I don't think we can use \d+ on a temporary table here, because the
backend ID appears in the namespace, which is causing a failure on one
of the CI OSes due to nondeterminism:

CREATE TEMP TABLE temp_parted (a char) PARTITION BY LIST (a)
CONFIGURATION (VALUES IN ('a') DEFAULT PARTITION temp_parted_default);
\d+ temp_parted
- Partitioned table "pg_temp_3.temp_parted"
+ Partitioned table "pg_temp_4.temp_parted"



Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
I don't think we can use \d+ on a temporary table here, because the
backend ID appears in the namespace, which is causing a failure on one
of the CI OSes due to nondeterminism:

CREATE TEMP TABLE temp_parted (a char) PARTITION BY LIST (a)
CONFIGURATION (VALUES IN ('a') DEFAULT PARTITION temp_parted_default);
\d+ temp_parted
- Partitioned table "pg_temp_3.temp_parted"
+ Partitioned table "pg_temp_4.temp_parted"

I've updated the tests accordingly. PFA version 4.  
As none of the recent proposals to modify the syntax were seconded by anyone, I return the previous Ready-for-committer CF status.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
Nitin Jadhav
Date:
I have reviewed the v4 patch. The patch does not get applied on the latest source. Kindly rebase. 
However I have found few comments.

1.
> +-- must fail because of wrong configuration
> +CREATE TABLE tbl_hash_fail (i int) PARTITION BY HASH (i)
> +CONFIGURATION (values in (1, 2), (3, 4) default partition tbl_default);

Here some of the keywords are mentioned in UPPER CASE (Ex: CREATE TABLE, CONFIGURATION, etc) and some are mentioned in lower case (Ex: values in, default partition, etc). Kindly make it common. I feel making it to UPPER CASE is better. Please take care of this in all the cases.

2. It is better to separate the failure cases and success cases in /src/test/regress/sql/create_table.sql for better readability. All the failure cases can be listed first and then the success cases.

3. 
> +           char *part_relname;
> +
> +           /*
> +            * Generate partition name in the format:
> +            * $relname_$partnum
> +            * All checks of name validity will be made afterwards in DefineRelation()
> +            */
> +           part_relname = psprintf("%s_%d", cxt->relation->relname, i);

The assignment can be done directly while declaring the variable.

Thanks & Regards,
Nitin Jadhav

On Wed, Mar 3, 2021 at 1:56 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
https://commitfest.postgresql.org/32/2694/

I don't know what committers will say, but I think that "ALTER TABLE" might be
the essential thing for this patch to support, not "CREATE".  (This is similar
to ALTER..SET STATISTICS, which is not allowed in CREATE.)

The reason is that ALTER is what's important for RANGE partitions, which need
to be created dynamically (for example, to support time-series data
continuously inserting data around 'now').  I assume it's sometimes also
important for LIST.  I think this patch should handle those cases better before
being commited, or else we risk implementing grammar and other user-facing interface
that fails to handle what's needed into the future (or that's non-essential).
Even if dynamic creation isn't implemented yet, it seems important to at least
implement the foundation for setting the configuration to *allow* that in the
future, in a manner that's consistent with the initial implementation for
"static" partitions.

ALTER also supports other ideas I mentioned here:
https://www.postgresql.org/message-id/20200706145947.GX4107%40telsasoft.com

  - ALTER .. SET interval (for dynamic/deferred RANGE partitioning)
  - ALTER .. SET modulus, for HASH partitioning, in the initial implementation,
    this would allow CREATING paritions, but wouldn't attempt to handle moving
    data if overlapping table already exists:
  - Could also set the table-name, maybe by format string;
  - Could set "retention interval" for range partitioning;
  - Could set if the partitions are themselves partitioned(??)

I think once you allow setting configuration parameters like this, then you
might have an ALTER command to "effect" them, which would create any static
tables required by the configuration.  maybe that'd be automatic, but if
there's an "ALTER .. APPLY PARTITIONS" command (or whatever), maybe in the
future, the command could also be used to "repartition" existing table data
into partitions with more fine/course granularity (modulus, or daily vs monthly
range, etc).

--
Justin


Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
I have reviewed the v4 patch. The patch does not get applied on the latest source. Kindly rebase. 
However I have found few comments.

1.
> +-- must fail because of wrong configuration
> +CREATE TABLE tbl_hash_fail (i int) PARTITION BY HASH (i)
> +CONFIGURATION (values in (1, 2), (3, 4) default partition tbl_default);

Here some of the keywords are mentioned in UPPER CASE (Ex: CREATE TABLE, CONFIGURATION, etc) and some are mentioned in lower case (Ex: values in, default partition, etc). Kindly make it common. I feel making it to UPPER CASE is better. Please take care of this in all the cases.

2. It is better to separate the failure cases and success cases in /src/test/regress/sql/create_table.sql for better readability. All the failure cases can be listed first and then the success cases.

3. 
> +           char *part_relname;
> +
> +           /*
> +            * Generate partition name in the format:
> +            * $relname_$partnum
> +            * All checks of name validity will be made afterwards in DefineRelation()
> +            */
> +           part_relname = psprintf("%s_%d", cxt->relation->relname, i);

The assignment can be done directly while declaring the variable.
Thank you for your review! 
I've rebased the patch and made the changes mentioned. 
PFA v5.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Automatic HASH and LIST partition creation

From
John Naylor
Date:

On Fri, Jul 9, 2021 at 6:30 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

> Thank you for your review!
> I've rebased the patch and made the changes mentioned.
> PFA v5.

I've set this back to "needs review" in CF.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
> Thank you for your review!
> I've rebased the patch and made the changes mentioned.
> PFA v5.

I've set this back to "needs review" in CF.
Thanks for the attention! I did the review of this patch, and the changes I've introduced in v5 are purely cosmetic. So I'd suppose the ready-for-committer status should not better have been changed.
So I'd like return it to ready-for-committer. If you mind against this, please mention. The opinion of Nitin, a second reviewer, is also very much appreciated.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Robert Haas
Date:
On Tue, Mar 2, 2021 at 3:26 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> I don't know what committers will say, but I think that "ALTER TABLE" might be
> the essential thing for this patch to support, not "CREATE".  (This is similar
> to ALTER..SET STATISTICS, which is not allowed in CREATE.)
>
> The reason is that ALTER is what's important for RANGE partitions, which need
> to be created dynamically (for example, to support time-series data
> continuously inserting data around 'now').  I assume it's sometimes also
> important for LIST.  I think this patch should handle those cases better before
> being commited, or else we risk implementing grammar and other user-facing interface
> that fails to handle what's needed into the future (or that's non-essential).
> Even if dynamic creation isn't implemented yet, it seems important to at least
> implement the foundation for setting the configuration to *allow* that in the
> future, in a manner that's consistent with the initial implementation for
> "static" partitions.

I don't think it's a hard requirement, but it's an interesting point.
My initial reactions to the patch are:

- I don't think it's a very good idea to support LIST and HASH but not
RANGE. We need a design that can work for all three partitioning
strategies, even if we don't have support for all of them in the
initial patch. If they CAN all be in the same patch, so much the
better.

- I am not very impressed with the syntax. CONFIGURATION is an odd
word that seems too generic for what we're talking about here. It
would be tempting to use a connecting word like WITH or USING except
that both would be ambiguous here, so we can't. MySQL and Oracle use
the keyword PARTITIONS -- which I realize isn't a keyword at all in
PostgreSQL right now -- to introduce the partition specification. DB2
uses no keyword at all; it seems you just say PARTITION BY
(mypartitioncol) (...partition specifications go here...). I think
either approach could work for us. Avoiding the extra keyword is a
plus, especially since I doubt we're likely to support the exact
syntax that Oracle and MySQL offer anyway - though if we do, then I'd
be in favor of inserting the PARTITIONS keyword so that people's SQL
can work without modification.

- We need to think a little bit about exactly what we're trying to do.
The simplest imaginable thing here would be to just give people a
place to put a bunch of partition specifications. So you can imagine
letting someone say PARTITION BY HASH (FOR VALUES WITH (MODULUS 2,
REMAINDER 0), FOR VALUES WITH (MODULUS 2, REMAINDER 1)). However, the
patch quite rightly rejects that approach in favor of the theory that,
at CREATE TABLE time, you're just going to want to give a modulus and
have the system create one partition for every possible remainder. But
that could be expressed even more compactly than what the patch does.
Instead of saying PARTITION BY HASH CONFIGURATION (MODULUS 4) we could
just let people say PARTITION BY HASH (4) or probably even PARTITION
BY HASH 4.

- For list partitioning, the patch falls back to just letting you put
a bunch of VALUES IN clauses in the CREATE TABLE statement. I don't
find something like PARTITION BY LIST CONFIGURATION (VALUES IN (1, 2),
(1, 3)) to be particularly readable. What are all the extra keywords
adding? We could just say PARTITION BY LIST ((1, 2), (1, 3)). I think
I would find that easier to remember; not sure what other people
think. As an alternative, PARTITION BY LIST VALUES IN (1, 2), (1, 3)
looks workable, too.

- What about range partitioning? This is an interesting case because
while in theory you could leave gaps between range partitions, in
practice people probably don't want to do that very often, and it
might be better to have a simpler syntax that caters to the common
case, since people can always create partitions individually if they
happen to want gaps. So you can imagine making something like
PARTITION BY RANGE ((MINVALUE), (42), (163)) mean create two
partitions, one from (MINVALUE) to (42) and the other from (42) to
(163). I think that would be pretty useful.

- Another possible separating keyword here would be INITIALLY, which
is already a parser keyword. So then you could have stuff like
PARTITION BY HASH INITIALLY 4, PARTITION BY LIST INITIALLY ((1, 2),
(1, 3)), PARTITION BY RANGE INITIALLY ((MINVALUE), (42), (163)).

- The patch doesn't document the naming convention for the
automatically created partitions, and it is worth thinking a bit about
how that is going to work. Do people want to be able to specify the
name of the partitioned table when they are using this syntax, or are
they happy with automatically generated names? If the latter, are they
happy with THESE automatically generated names? I guess for HASH
appending _%d where %d is the modulus is fine, but it is not necessary
so great for LIST. If I said CREATE TABLE foo ... PARTITION BY LIST
(('en'), ('ru'), ('jp')) I think I'd be hoping to end up with
partitions named foo_en, foo_ru, and foo_jp rather than foo_0, foo_1,
foo_2. Or maybe I'd rather say PARTITION BY LIST (foo_en ('en'),
foo_ru ('ru'), foo_jp ('jp')) or something like that to be explicit
about it. Not sure. But it's worth some thought. I think this comes
into focus even more clearly for range partitions, where you probably
want the partitions to follow a convention like basetablename_yyyy_mm.

- The documentation for the CONFIGURATION option doesn't match the
grammar. The documentation makes it an independent clause, so
CONFIGURATION could be specified even if PARTITION BY is not. But the
implementation makes the better choice to treat CONFIGURATION as a
further specification of PARTITION BY.

- I don't think this patch is really all that close to being ready for
committer. Beyond the design issues which seem to need more thought,
there's stuff in the patch like:

+ elog(DEBUG1,"stransformPartitionAutoCreate HASH i %d MODULUS %d \n %s\n",
+ i, bound->modulus, nodeToString(part));

Now, on the one hand, debugging elogs like this have little business
in a final patch. And, on the other hand, if we were going to include
them in the final patch, we'd probably want to at least spell the
function name correctly. Similarly, it's evident that this test case
has not been carefully reviewed by anyone, including the author:

+REATE TABLE fail_parted (a int) PARTITION BY HASH (a) CONFIGURATION
+(MODULUS 10 DEFAULT PARTITION hash_default);

Not too surprisingly, the system isn't familiar with the REATE command.

- There's some questionable error-reporting behavior in here, too, particularly:

+CREATE TABLE fail_parted (a int) PARTITION BY LIST (a) CONFIGURATION
+(values in (1, 2), (1, 3));
+ERROR:  partition "fail_parted_1" would overlap partition "fail_parted_0"
+LINE 2: (values in (1, 2), (1, 3));

Since the user hasn't provided the names fail_parted_0 or
fail_parted_1, it kind of stinks to use them in an error report. The
error cursor is good, but I wonder if we need to do better. One option
would be to go to a syntax where the user specifies the partition
names explicitly, which then justifies using that name in an error
report. Another possibility would be to give a different message in
this case, like:

ERROR: partition for values in (1, 2) would overlap partition for
values in (1, 3)

Now, that would require a pretty substantial redesign of the patch,
and I'm not sure it's worth the effort. But I'm also not sure that it
isn't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Automatic HASH and LIST partition creation

From
Pavel Borisov
Date:
- I don't think it's a very good idea to support LIST and HASH but not
RANGE. We need a design that can work for all three partitioning
strategies, even if we don't have support for all of them in the
initial patch. If they CAN all be in the same patch, so much the
better.

- I am not very impressed with the syntax. CONFIGURATION is an odd
word that seems too generic for what we're talking about here. It
would be tempting to use a connecting word like WITH or USING except
that both would be ambiguous here, so we can't. MySQL and Oracle use
the keyword PARTITIONS -- which I realize isn't a keyword at all in
PostgreSQL right now -- to introduce the partition specification. DB2
uses no keyword at all; it seems you just say PARTITION BY
(mypartitioncol) (...partition specifications go here...). I think
either approach could work for us. Avoiding the extra keyword is a
plus, especially since I doubt we're likely to support the exact
syntax that Oracle and MySQL offer anyway - though if we do, then I'd
be in favor of inserting the PARTITIONS keyword so that people's SQL
can work without modification.

- We need to think a little bit about exactly what we're trying to do.
The simplest imaginable thing here would be to just give people a
place to put a bunch of partition specifications. So you can imagine
letting someone say PARTITION BY HASH (FOR VALUES WITH (MODULUS 2,
REMAINDER 0), FOR VALUES WITH (MODULUS 2, REMAINDER 1)). However, the
patch quite rightly rejects that approach in favor of the theory that,
at CREATE TABLE time, you're just going to want to give a modulus and
have the system create one partition for every possible remainder. But
that could be expressed even more compactly than what the patch does.
Instead of saying PARTITION BY HASH CONFIGURATION (MODULUS 4) we could
just let people say PARTITION BY HASH (4) or probably even PARTITION
BY HASH 4.

- For list partitioning, the patch falls back to just letting you put
a bunch of VALUES IN clauses in the CREATE TABLE statement. I don't
find something like PARTITION BY LIST CONFIGURATION (VALUES IN (1, 2),
(1, 3)) to be particularly readable. What are all the extra keywords
adding? We could just say PARTITION BY LIST ((1, 2), (1, 3)). I think
I would find that easier to remember; not sure what other people
think. As an alternative, PARTITION BY LIST VALUES IN (1, 2), (1, 3)
looks workable, too.

- What about range partitioning? This is an interesting case because
while in theory you could leave gaps between range partitions, in
practice people probably don't want to do that very often, and it
might be better to have a simpler syntax that caters to the common
case, since people can always create partitions individually if they
happen to want gaps. So you can imagine making something like
PARTITION BY RANGE ((MINVALUE), (42), (163)) mean create two
partitions, one from (MINVALUE) to (42) and the other from (42) to
(163). I think that would be pretty useful.

- Another possible separating keyword here would be INITIALLY, which
is already a parser keyword. So then you could have stuff like
PARTITION BY HASH INITIALLY 4, PARTITION BY LIST INITIALLY ((1, 2),
(1, 3)), PARTITION BY RANGE INITIALLY ((MINVALUE), (42), (163)).

Robert, I've read your considerations and I have a proposal to change the syntax to make it like:

CREATE TABLE foo (bar text) PARTITION BY LIST (bar) PARTITIONS (('US'), ('UK', 'RU'));
CREATE TABLE foo (bar text) PARTITION BY LIST (bar) PARTITIONS (foo_us('US'), foo_uk_ru('UK', 'RU'), { DEFAULT foo_dflt | AUTOMATIC });

CREATE TABLE foo (bar int) PARTITION BY HASH (bar) PARTITIONS (5);

CREATE TABLE foo (bar int) PARTITION BY RANGE (bar) PARTITIONS (FROM 1 TO 10 INTERVAL 2, { DEFAULT foo_dflt | AUTOMATIC }); 

- I think using partitions syntax without any keyword at all, is quite different from the existing pseudo-english PostgreSQL syntax. Also, it will need two consecutive brackets divided by nothing (<partitioning key>)(<partitions configuration>). So I think it's better to use the keyword PARTITIONS

- from the current patch it seems like a 'syntactic sugar' only but I don't think it is being so. From a new syntaх proposal it's seen that it can enable three options 
(1) create a fixed set of partitions with everything else comes to the default partition 
(2) create a fixed set of partitions with everything else invokes error on insert 
(3) create a set of partitions with everything else invokes a new partition creation based on a partition key (AUTOMATIC word). Like someone will be able to do:
CREATE TABLE foo (a varchar) PARTITION BY LIST (SUBSTRING (a, 1, 1)) PARTITIONS (('a'),('b'),('c'));
INSERT INTO foo VALUES ("doctor"); // will automatically create partition for 'd'
INSERT INTO foo VALUES ("dam"); // will come into partition 'd' 

Option (3) is not yet implemented and sure it needs much care from DBA to not end up with the each-row-separate-partition. 

- Also with option (3) and AUTOMATIC word someone will be able to do:
CREATE TABLE foo (a timestamp, t text) PARTITION BY LIST(EXTRACT (YEAR FROM a)) PARTITIONS (('1982'),('1983'),('1984'));
INSERT INTO foo VALUES (TIMESTAMP '1986-01-01 13:30:03', 'Orwell'); // creates '1986' partition and inserts into it
I think this option will be very useful as partitioning based on regular intervals of time I think is quite natural and often used. And to do it we don't need to implement arbitrary intervals (partition by range). But I think it's also worth implementing (proposed syntax for RANGE see above);

- As for the naming of partitions I've seen what is done in Oracle: partition names can be provided when you create an initial set, and when a partition is created automatically on insert it will get some illegible name chosen by the system (it even doesn't include parent table prefix). I'd propose to implement: 
(1) If partition name is not specified it has format <parent_table_name>_<value_of_partition_key> 
where <value_of_partition_key> is a remainder in HASH, the first element of the list of values for the partition in LIST case, left range-bound in RANGE case 
(2) If it is specified (not possible at partition creation at insert command) it is <parent_table_name>_<specified_name>
Though we'll probably need to have some rules for the abbreviation for partition name should not exceed the relation name length limit. I think partitions naming with plain _numbers in the existing patch is for the purpose of increasing relation name length as little as possible for not implementing abbreviation.

What do you think will the described approach lead to a useful patch? Should it be done as a whole or it's possible to commit it in smaller steps? (E.g. first part without AUTOMATIC capability, then add AUTOMATIC capability. Or with some other order of features implementation)

My own view is that if some implementation of syntax is solidly decided, it will promote work on more complicated logic of the patch and implement all parts one-by-one for the feature finally become really usable (not just helping to squash several SQL commands into one as this patch does). I see the existing patch as the starting point of the whole work and given some decisions on syntax I can try to rework and extend it accordingly. 

Overall I consider this useful for PostgreSQL.

What do you think about it?

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Automatic HASH and LIST partition creation

From
Robert Haas
Date:
On Wed, Jul 14, 2021 at 7:28 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> What do you think will the described approach lead to a useful patch? Should it be done as a whole or it's possible
tocommit it in smaller steps? (E.g. first part without AUTOMATIC capability, then add AUTOMATIC capability. Or with
someother order of features implementation) 

I would suggest that you consider on-the-fly partition creation to be
a completely separate feature from initial partition creation as part
of CREATE TABLE. I think you can have either without the other, and I
think the latter is a lot easier than the former. I doubt that
on-the-fly partition creation makes any sense at all for hash
partitions; there seems to be no reason not to pre-create all the
partitions. It's pretty straightforward to see how it should work for
LIST, but RANGE needs an interval or something to be stored in the
system catalogs so you can figure out where to put the boundaries, and
somehow you've got to identify a + operator for the relevant data
type. Tom Lane probably won't be thrilled if you suggest looking it up
based on the operator NAME. The bigger issue IMHO with on-the-fly
partition creation is avoiding deadlocks in the presence of current
inserters; I submit that without at least some kind of attempt to
avoid deadlocks and spurious errors there, it's not really a usable
scheme, and that seems hard.

On the other hand, modulo syntax details, creating partitions at
CREATE TABLE time seems relatively simple and, especially in the case
of hash partitioning, useful.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Automatic HASH and LIST partition creation

From
Justin Pryzby
Date:
On Tue, Jul 20, 2021 at 02:42:16PM -0400, Robert Haas wrote:
> The bigger issue IMHO with on-the-fly
> partition creation is avoiding deadlocks in the presence of current
> inserters; I submit that without at least some kind of attempt to
> avoid deadlocks and spurious errors there, it's not really a usable
> scheme, and that seems hard.

I was thinking that for dynamic creation, there would be a DDL command to
create the necessary partitions:

-- Creates 2021-01-02, unless the month already exists:
ALTER TABLE bydate SET GRANULARITY='1day';
ALTER TABLE bydate CREATE PARTITION FOR VALUE ('2021-01-02');

I'd want it to support changing the granularity of the range partitions:

-- Creates 2021-01 unless the month already exists.
-- Errors if a day partition already exists which would overlap?
ALTER TABLE bydate SET granularity='1month';
ALTER TABLE bydate CREATE PARTITION FOR VALUE ('2021-01-03');

It could support creating ranges, which might create multiple partitions,
depending on the granularity:

ALTER TABLE bydate CREATE PARTITION FOR VALUES ('2021-01-01') TO ('2021-02-01')

Or the catalog could include not only granularity, but also endpoints:

ALTER TABLE bydate SET ENDPOINTS ('2012-01-01') ('2022-01-01')
ALTER TABLE bydate CREATE PARTITIONS; --create anything needed to fill from a->b
ALTER TABLE bydate PRUNE PARTITIONS; --drop anything outside of [a,b]

I would use this to set "fine" granularity for large tables, and "course"
granularity for tables that were previously set to "fine" granularity, but its
partitions are no longer large enough to justify it.  This logic currently
exists in our application - we create partitions dynamically immediately before
inserting.  But it'd be nicer if it were created asynchronously.  It may create
tables which were never inserted into, which is fine - they'd be course
granularity tables (one per month).

I think this might elegantly allow both 1) subpartitioning; 2) repartitioning
to a different granularity (for which I currently have my own tool).

-- 
Justin



Re: [PATCH] Automatic HASH and LIST partition creation

From
Robert Haas
Date:
On Tue, Jul 20, 2021 at 3:13 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Tue, Jul 20, 2021 at 02:42:16PM -0400, Robert Haas wrote:
> > The bigger issue IMHO with on-the-fly
> > partition creation is avoiding deadlocks in the presence of current
> > inserters; I submit that without at least some kind of attempt to
> > avoid deadlocks and spurious errors there, it's not really a usable
> > scheme, and that seems hard.
>
> I was thinking that for dynamic creation, there would be a DDL command to
> create the necessary partitions:
>
> -- Creates 2021-01-02, unless the month already exists:
> ALTER TABLE bydate SET GRANULARITY='1day';
> ALTER TABLE bydate CREATE PARTITION FOR VALUE ('2021-01-02');

Well, that dodges the deadlock issue with doing it implicitly, but it
also doesn't seem to offer a lot of value over just creating the
partitions in a fully manual way. I mean you could just say:

CREATE TABLE bydate_2021_02_02 PARTITION OF bydate FOR VALUES FROM
('2021-01-02') TO ('2021-02-03');

It's longer, but it's not really that bad.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Automatic HASH and LIST partition creation

From
Daniel Gustafsson
Date:
This thread has stalled since July with review comments unanswered, I'm marking
the patch Returned with Feedback.  Please feel free to resubmit when/if a new
patch is available.

--
Daniel Gustafsson        https://vmware.com/




Re: [PATCH] Automatic HASH and LIST partition creation

From
Stéphane Tachoires
Date:
Hi,
I found that thread (and the patch), but it seems to be pretty dead.
Patch didn't apply, due to gen_node_support.pl
Can I hope for a rebirth ?

I've made a rebased patch,in case of no response...

Perhaps it's too early for a commit ; automatic range partitioning is still missing and, according to https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements, syntax is arguable.

If 'USING' it out of option (already a keyword for CREATE TABLE) and 'CONFIGURATION()' is not what we want, we should reach for a final decision first.
I suggest OVER that is a keyword but unused in CREATE TABLE (nor ALTER TABLE). Whatever...

For RANGE partitioning I think of four syntaxes (inspired by pg_partman)
PARTITION BY RANGE(stamp) CONFIGURATION (SPAN interval CENTER datetime BACK integer AHEAD integer [DEFAULT [PARTITION] [defname]])
PARTITION BY RANGE(stamp) CONFIGURATION (SPAN interval START firstfrombound END lasttobound [DEFAULT [PARTITION] [defname]])
PARTITION BY RANGE(region_id) CONFIGURATION (STEP integer START integer END integer [DEFAULT [PARTITION] [defname]])
PARTITION BY RANGE(name) CONFIGURATION (BOUNDS (boundlist) [START firstfrombound] [END lasttobound] [DEFAULT [PARTITION] [defname]])

Last one should solve the addition operator problem with non numeric non timedate range.
Plus, it allows non uniform range (thinking about an "encyclopedia volume" partitioning, you know 'A', 'B-CL', 'CL-D'...)

CREATE table (LIKE other INCLUDING PARTITIONS) should create 'table' partitioned the same as 'other'
and
CREATE table (LIKE other INCLUDING PARTITIONS) PARTITION BY partspec CONFIGURATION(), should create 'table' partitioned by partspec and sub partitioned as 'other'.

Then CREATE could accept multiple PARTITION BY CONFIGURATION().

For ALTER TABLE (and automatic maintenance) to be usable, we will need SPLIT and MERGE CONCURRENTLY (pg_pathman ?) enhanced by CREATE TABLE LIKE to handle subpartitioning. But that's another story.

Stéphane.


Le jeu. 2 déc. 2021 à 12:20, Daniel Gustafsson <daniel@yesql.se> a écrit :
This thread has stalled since July with review comments unanswered, I'm marking
the patch Returned with Feedback.  Please feel free to resubmit when/if a new
patch is available.

--
Daniel Gustafsson               https://vmware.com/





--
"Où se posaient les hirondelles avant l'invention du téléphone ?"
  -- Grégoire Lacroix
Attachment