Home > mailing lists

Re: [PATCH] Automatic HASH and LIST partition creation - Mailing list pgsql-hackers

From	Pavel Borisov
Subject	Re: [PATCH] Automatic HASH and LIST partition creation
Date	July 14, 2021 14:28:05
Msg-id	CALT9ZEEA-_1LthvsKAy0bNnyhzUm1g3pE=nk708_pX8DsfQQ8g@mail.gmail.com Whole thread Raw
In response to	Re: [PATCH] Automatic HASH and LIST partition creation (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: [PATCH] Automatic HASH and LIST partition creation (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

Tree view

- I don't think it's a very good idea to support LIST and HASH but not
RANGE. We need a design that can work for all three partitioning
strategies, even if we don't have support for all of them in the
initial patch. If they CAN all be in the same patch, so much the
better.

- I am not very impressed with the syntax. CONFIGURATION is an odd
word that seems too generic for what we're talking about here. It
would be tempting to use a connecting word like WITH or USING except
that both would be ambiguous here, so we can't. MySQL and Oracle use
the keyword PARTITIONS -- which I realize isn't a keyword at all in
PostgreSQL right now -- to introduce the partition specification. DB2
uses no keyword at all; it seems you just say PARTITION BY
(mypartitioncol) (...partition specifications go here...). I think
either approach could work for us. Avoiding the extra keyword is a
plus, especially since I doubt we're likely to support the exact
syntax that Oracle and MySQL offer anyway - though if we do, then I'd
be in favor of inserting the PARTITIONS keyword so that people's SQL
can work without modification.

- We need to think a little bit about exactly what we're trying to do.
The simplest imaginable thing here would be to just give people a
place to put a bunch of partition specifications. So you can imagine
letting someone say PARTITION BY HASH (FOR VALUES WITH (MODULUS 2,
REMAINDER 0), FOR VALUES WITH (MODULUS 2, REMAINDER 1)). However, the
patch quite rightly rejects that approach in favor of the theory that,
at CREATE TABLE time, you're just going to want to give a modulus and
have the system create one partition for every possible remainder. But
that could be expressed even more compactly than what the patch does.
Instead of saying PARTITION BY HASH CONFIGURATION (MODULUS 4) we could
just let people say PARTITION BY HASH (4) or probably even PARTITION
BY HASH 4.

- For list partitioning, the patch falls back to just letting you put
a bunch of VALUES IN clauses in the CREATE TABLE statement. I don't
find something like PARTITION BY LIST CONFIGURATION (VALUES IN (1, 2),
(1, 3)) to be particularly readable. What are all the extra keywords
adding? We could just say PARTITION BY LIST ((1, 2), (1, 3)). I think
I would find that easier to remember; not sure what other people
think. As an alternative, PARTITION BY LIST VALUES IN (1, 2), (1, 3)
looks workable, too.

- What about range partitioning? This is an interesting case because
while in theory you could leave gaps between range partitions, in
practice people probably don't want to do that very often, and it
might be better to have a simpler syntax that caters to the common
case, since people can always create partitions individually if they
happen to want gaps. So you can imagine making something like
PARTITION BY RANGE ((MINVALUE), (42), (163)) mean create two
partitions, one from (MINVALUE) to (42) and the other from (42) to
(163). I think that would be pretty useful.

- Another possible separating keyword here would be INITIALLY, which
is already a parser keyword. So then you could have stuff like
PARTITION BY HASH INITIALLY 4, PARTITION BY LIST INITIALLY ((1, 2),
(1, 3)), PARTITION BY RANGE INITIALLY ((MINVALUE), (42), (163)).

Robert, I've read your considerations and I have a proposal to change the syntax to make it like:

CREATE TABLE foo (bar text) PARTITION BY LIST (bar) PARTITIONS (('US'), ('UK', 'RU'));
CREATE TABLE foo (bar text) PARTITION BY LIST (bar) PARTITIONS (foo_us('US'), foo_uk_ru('UK', 'RU'), { DEFAULT foo_dflt | AUTOMATIC });

CREATE TABLE foo (bar int) PARTITION BY HASH (bar) PARTITIONS (5);

CREATE TABLE foo (bar int) PARTITION BY RANGE (bar) PARTITIONS (FROM 1 TO 10 INTERVAL 2, { DEFAULT foo_dflt | AUTOMATIC });

- I think using partitions syntax without any keyword at all, is quite different from the existing pseudo-english PostgreSQL syntax. Also, it will need two consecutive brackets divided by nothing (<partitioning key>)(<partitions configuration>). So I think it's better to use the keyword PARTITIONS

- from the current patch it seems like a 'syntactic sugar' only but I don't think it is being so. From a new syntaх proposal it's seen that it can enable three options

(1) create a fixed set of partitions with everything else comes to the default partition

(2) create a fixed set of partitions with everything else invokes error on insert

(3) create a set of partitions with everything else invokes a new partition creation based on a partition key (AUTOMATIC word). Like someone will be able to do:

CREATE TABLE foo (a varchar) PARTITION BY LIST (SUBSTRING (a, 1, 1)) PARTITIONS (('a'),('b'),('c'));

INSERT INTO foo VALUES ("doctor"); // will automatically create partition for 'd'

INSERT INTO foo VALUES ("dam"); // will come into partition 'd'

Option (3) is not yet implemented and sure it needs much care from DBA to not end up with the each-row-separate-partition.

- Also with option (3) and AUTOMATIC word someone will be able to do:

CREATE TABLE foo (a timestamp, t text) PARTITION BY LIST(EXTRACT (YEAR FROM a)) PARTITIONS (('1982'),('1983'),('1984'));

INSERT INTO foo VALUES (TIMESTAMP '1986-01-01 13:30:03', 'Orwell'); // creates '1986' partition and inserts into it

I think this option will be very useful as partitioning based on regular intervals of time I think is quite natural and often used. And to do it we don't need to implement arbitrary intervals (partition by range). But I think it's also worth implementing (proposed syntax for RANGE see above);

- As for the naming of partitions I've seen what is done in Oracle: partition names can be provided when you create an initial set, and when a partition is created automatically on insert it will get some illegible name chosen by the system (it even doesn't include parent table prefix). I'd propose to implement:

(1) If partition name is not specified it has format <parent_table_name>_<value_of_partition_key>

where <value_of_partition_key> is a remainder in HASH, the first element of the list of values for the partition in LIST case, left range-bound in RANGE case

(2) If it is specified (not possible at partition creation at insert command) it is <parent_table_name>_<specified_name>

Though we'll probably need to have some rules for the abbreviation for partition name should not exceed the relation name length limit. I think partitions naming with plain _numbers in the existing patch is for the purpose of increasing relation name length as little as possible for not implementing abbreviation.

What do you think will the described approach lead to a useful patch? Should it be done as a whole or it's possible to commit it in smaller steps? (E.g. first part without AUTOMATIC capability, then add AUTOMATIC capability. Or with some other order of features implementation)

My own view is that if some implementation of syntax is solidly decided, it will promote work on more complicated logic of the patch and implement all parts one-by-one for the feature finally become really usable (not just helping to squash several SQL commands into one as this patch does). I see the existing patch as the starting point of the whole work and given some decisions on syntax I can try to rework and extend it accordingly.

Overall I consider this useful for PostgreSQL.

What do you think about it?

Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

pgsql-hackers by date:

From: Greg Nancarrow
Date: 14 July 2021, 14:21:36
Subject: Re: row filtering for logical replication

From: vignesh C
Date: 14 July 2021, 14:34:01
Subject: Re: partial heap only tuples

Re: [PATCH] Automatic HASH and LIST partition creation - Mailing list pgsql-hackers

Previous

Next