Thread: Proposal for syntax to support creation of partition tables whencreating parent table
Proposal for syntax to support creation of partition tables whencreating parent table
From
Muhammad Usama
Date:
Hi Hackers,
CREATE TABLE table_name (..)
PARTITION BY { RANGE | LIST | HASH } (..)
(
list of partitions
) ;
CREATE TABLE All_Sales ( year INT, month INT, day INT, info TEXT)
PARTITION BY RANGE(year)(
PARTITION sale_2019_2020 FOR VALUES FROM (2019) TO (2021)
PARTITION BY LIST(month)
(
PARTITION sale_2019_2020_1 FOR VALUES IN (1,2,3,4)
PARTITION BY RANGE(day)(
PARTITION sale_2019_2020_1_1 FOR VALUES FROM (1) TO (10)
PARTITION BY HASH(info)
(
PARTITION sale_2019_2020_1_1_1 FOR VALUES WITH (MODULUS 2,REMAINDER 0),
PARTITION sale_2019_2020_1_1_2 FOR VALUES WITH (MODULUS 2,REMAINDER 1)
),
PARTITION sale_2019_2020_1_2 FOR VALUES FROM (10) TO (20),
PARTITION sale_2019_2020_1_3 FOR VALUES FROM (20) TO (32)),
PARTITION sale_2019_2020_2 FOR VALUES IN (5,6,7,8),
PARTITION sale_2019_2020_3 FOR VALUES IN (9,10,11,12)
),
PARTITION sale_2021_2022 FOR VALUES FROM (2021) TO (2023),
PARTITION sale_2023_2024 FOR VALUES FROM (2023) TO (2025),
PARTITION sale_default default
I want to propose an extension to CREATE TABLE syntax to allow the creation of partition tables along with its parent table using a single statement.
In this proposal, I am proposing to specify the list of partitioned tables after the PARTITION BY clause.
PARTITION BY { RANGE | LIST | HASH } (..)
(
list of partitions
) ;
Below are a few examples of the proposed syntax, in a nutshell, I am leveraging the syntax currently supported by Postgres for creating partitioned tables. The purpose of this proposal is to combine the creation of the parent partition table and its partitions in one SQL statement.
CREATE TABLE Sales (salesman_id INT, salesman_name TEXT, sales_region TEXT, hiring_date DATE, sales_amount INT )
PARTITION BY RANGE (hiring_date)
(
PARTITION part_one FOR VALUES FROM ('2008-02-01') TO ('2008-03-01'),
PARTITION part_two FOR VALUES FROM ('2009-02-01') TO ('2009-03-01'),
PARTITION part_def DEFAULT
);
PARTITION BY RANGE (hiring_date)
(
PARTITION part_one FOR VALUES FROM ('2008-02-01') TO ('2008-03-01'),
PARTITION part_two FOR VALUES FROM ('2009-02-01') TO ('2009-03-01'),
PARTITION part_def DEFAULT
);
CREATE TABLE Sales2 (salesman_id INT, salesman_name TEXT, sales_region TEXT, hiring_date DATE, sales_amount INT )
PARTITION BY HASH (salesman_id)
(
PARTITION par_one FOR VALUES WITH (MODULUS 2, REMAINDER 0),
PARTITION par_two FOR VALUES WITH (MODULUS 2, REMAINDER 1)
PARTITION BY HASH (salesman_id)
(
PARTITION par_one FOR VALUES WITH (MODULUS 2, REMAINDER 0),
PARTITION par_two FOR VALUES WITH (MODULUS 2, REMAINDER 1)
);
CREATE TABLE Sales3(salesman_id INT, salesman_name TEXT, sales_region TEXT, hiring_date DATE, sales_amount INT)
PARTITION BY LIST (sales_region)
(
PARTITION pt_one FOR VALUES IN ('JAPAN','CHINA'),
PARTITION pt_two FOR VALUES IN ('USA','CANADA'),
PARTITION pt_def DEFAULT
);
(
PARTITION pt_one FOR VALUES IN ('JAPAN','CHINA'),
PARTITION pt_two FOR VALUES IN ('USA','CANADA'),
PARTITION pt_def DEFAULT
);
-- Similarly for specifying subpartitions of partitioned tables
PARTITION BY RANGE(year)(
PARTITION sale_2019_2020 FOR VALUES FROM (2019) TO (2021)
PARTITION BY LIST(month)
(
PARTITION sale_2019_2020_1 FOR VALUES IN (1,2,3,4)
PARTITION BY RANGE(day)(
PARTITION sale_2019_2020_1_1 FOR VALUES FROM (1) TO (10)
PARTITION BY HASH(info)
(
PARTITION sale_2019_2020_1_1_1 FOR VALUES WITH (MODULUS 2,REMAINDER 0),
PARTITION sale_2019_2020_1_1_2 FOR VALUES WITH (MODULUS 2,REMAINDER 1)
),
PARTITION sale_2019_2020_1_2 FOR VALUES FROM (10) TO (20),
PARTITION sale_2019_2020_1_3 FOR VALUES FROM (20) TO (32)),
PARTITION sale_2019_2020_2 FOR VALUES IN (5,6,7,8),
PARTITION sale_2019_2020_3 FOR VALUES IN (9,10,11,12)
),
PARTITION sale_2021_2022 FOR VALUES FROM (2021) TO (2023),
PARTITION sale_2023_2024 FOR VALUES FROM (2023) TO (2025),
PARTITION sale_default default
);
This new syntax requires minimal changes in the code. I along with my colleague Movead.li have drafted a rough POC patch attached to this email.
Please note that the patch is just to showcase the new syntax and get a consensus on the overall design and approach.
As far as I know, there are already few ongoing discussions related to the partition syntax enhancements, but the proposed syntax will not interfere with these ongoing proposals. Here is a link to one such discussion:
Please feel free to share your thoughts.
Best Regards
...
Muhammad Usama
Highgo Software Canada
ADDR: 10318 WHALLEY BLVD, Surrey, BC
Attachment
Re: Proposal for syntax to support creation of partition tables whencreating parent table
From
Fabien COELHO
Date:
Hello Muhammad, I think that it may be better to have a partition spec which describes not the list of partitions, but what is wanted, letting postgres to do some more work. See this thread: https://www.postgresql.org/message-id/alpine.DEB.2.21.1907150711080.22273@lancre > I want to propose an extension to CREATE TABLE syntax to allow the creation > of partition tables along with its parent table using a single statement. > > In this proposal, I am proposing to specify the list of partitioned tables > after the PARTITION BY clause. > > CREATE TABLE table_name (..) > PARTITION BY { RANGE | LIST | HASH } (..) > ( > list of partitions > ) ; > Below are a few examples of the proposed syntax, in a nutshell, I am > leveraging the syntax currently supported by Postgres for creating > partitioned tables. The purpose of this proposal is to combine the creation > of the parent partition table and its partitions in one SQL statement. > > CREATE TABLE Sales (salesman_id INT, salesman_name TEXT, sales_region TEXT, > hiring_date DATE, sales_amount INT ) > PARTITION BY RANGE (hiring_date) > ( > PARTITION part_one FOR VALUES FROM ('2008-02-01') TO ('2008-03-01'), > PARTITION part_two FOR VALUES FROM ('2009-02-01') TO ('2009-03-01'), > PARTITION part_def DEFAULT > ); > > CREATE TABLE Sales2 (salesman_id INT, salesman_name TEXT, sales_region > TEXT, hiring_date DATE, sales_amount INT ) > PARTITION BY HASH (salesman_id) > ( > PARTITION par_one FOR VALUES WITH (MODULUS 2, REMAINDER 0), > PARTITION par_two FOR VALUES WITH (MODULUS 2, REMAINDER 1) > ); > > CREATE TABLE Sales3(salesman_id INT, salesman_name TEXT, sales_region TEXT, > hiring_date DATE, sales_amount INT) > PARTITION BY LIST (sales_region) > ( > PARTITION pt_one FOR VALUES IN ('JAPAN','CHINA'), > PARTITION pt_two FOR VALUES IN ('USA','CANADA'), > PARTITION pt_def DEFAULT > ); > > -- Similarly for specifying subpartitions of partitioned tables > > CREATE TABLE All_Sales ( year INT, month INT, day INT, info TEXT) > PARTITION BY RANGE(year)( > PARTITION sale_2019_2020 FOR VALUES FROM (2019) TO (2021) > PARTITION BY LIST(month) > ( > PARTITION sale_2019_2020_1 FOR VALUES IN (1,2,3,4) > PARTITION BY RANGE(day)( > PARTITION sale_2019_2020_1_1 FOR VALUES FROM (1) TO (10) > PARTITION BY HASH(info) > ( > PARTITION sale_2019_2020_1_1_1 FOR VALUES WITH (MODULUS > 2,REMAINDER 0), > PARTITION sale_2019_2020_1_1_2 FOR VALUES WITH (MODULUS > 2,REMAINDER 1) > ), > PARTITION sale_2019_2020_1_2 FOR VALUES FROM (10) TO (20), > PARTITION sale_2019_2020_1_3 FOR VALUES FROM (20) TO (32)), > PARTITION sale_2019_2020_2 FOR VALUES IN (5,6,7,8), > PARTITION sale_2019_2020_3 FOR VALUES IN (9,10,11,12) > ), > PARTITION sale_2021_2022 FOR VALUES FROM (2021) TO (2023), > PARTITION sale_2023_2024 FOR VALUES FROM (2023) TO (2025), > PARTITION sale_default default > ); > > This new syntax requires minimal changes in the code. I along with my > colleague Movead.li have drafted a rough POC patch attached to this email. > > Please note that the patch is just to showcase the new syntax and get a > consensus on the overall design and approach. > > As far as I know, there are already few ongoing discussions related to the > partition syntax enhancements, but the proposed syntax will not interfere > with these ongoing proposals. Here is a link to one such discussion: > https://www.postgresql.org/message-id/alpine.DEB.2.21.1907150711080.22273%40lancre > > Please feel free to share your thoughts. > > Best Regards > > ... > Muhammad Usama > Highgo Software Canada > URL : http://www.highgo.ca > ADDR: 10318 WHALLEY BLVD, Surrey, BC > -- Fabien Coelho - CRI, MINES ParisTech
Re: Proposal for syntax to support creation of partition tables when creating parent table
From
Tom Lane
Date:
Muhammad Usama <m.usama@gmail.com> writes: > I want to propose an extension to CREATE TABLE syntax to allow the creation > of partition tables along with its parent table using a single statement. TBH, I think this isn't a particularly good idea. It seems very reminiscent of the variant of CREATE SCHEMA that lets you create a bunch of contained objects along with the schema. That variant is a mess to support and AFAIK it's practically unused in the real world. (If it were used, we'd get requests to support more than the small number of object types that the CREATE SCHEMA grammar currently allows.) As Fabien noted, there's been some related discussion about this area, but nobody was advocating a solution of this particular shape. regards, tom lane
Re: Proposal for syntax to support creation of partition tables whencreating parent table
From
Ahsan Hadi
Date:
On Wed, Sep 25, 2019 at 8:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Muhammad Usama <m.usama@gmail.com> writes:
> I want to propose an extension to CREATE TABLE syntax to allow the creation
> of partition tables along with its parent table using a single statement.
TBH, I think this isn't a particularly good idea. It seems very
reminiscent of the variant of CREATE SCHEMA that lets you create
a bunch of contained objects along with the schema. That variant
is a mess to support and AFAIK it's practically unused in the
real world. (If it were used, we'd get requests to support more
than the small number of object types that the CREATE SCHEMA
grammar currently allows.)
IMO creating auto-partitions shouldn't be viewed as creating bunch of schema objects with CREATE SCHEMA command. Most of the other RDBMS solutions support the table partition syntax where parent partition table is specified with partitions and sub-partitions in same SQL statement. As I understand the proposal is not changing the syntax of creating partitions, it is providing the ease of creating parent partition table along with its partitions in same statement. I think it does make it easier when you are creating a big partition table with lots of partitions and sub-partitions.
The would also benefit users migrating to postgres from Oracle or mysql etc where similar syntax is supported.
And if not more I think it is a tick in the box with minimal code change.
As Fabien noted, there's been some related discussion about this
area, but nobody was advocating a solution of this particular shape.
The thread that Usama mentioned in his email is creating auto-partitions just for HASH partitions, this is trying to do similar for all types of partitions.
regards, tom lane
Highgo Software (Canada/China/Pakistan)
URL : http://www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
EMAIL: mailto: ahsan.hadi@highgo.ca
URL : http://www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
EMAIL: mailto: ahsan.hadi@highgo.ca
Re: Proposal for syntax to support creation of partition tables whencreating parent table
From
Amit Langote
Date:
Hi Ahsan, Usama Thanks for starting work on this. On Thu, Sep 26, 2019 at 3:46 AM Ahsan Hadi <ahsan.hadi@gmail.com> wrote: > On Wed, Sep 25, 2019 at 8:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> As Fabien noted, there's been some related discussion about this >> area, but nobody was advocating a solution of this particular shape. > > The thread that Usama mentioned in his email is creating auto-partitions just for HASH partitions, this is trying to dosimilar for all types of partitions. I agree that this proposal makes life easier for developers familiar with the partitioning syntax and features of other databases. However, it adds little functionality over what users can already do, even though today it takes multiple commands rather than just one. The problem is that the syntax proposed here is still verbose because users still have to spell out all the partition bounds by themselves. The focus of the other thread, as I understand it, is to implement the functionality to get the same thing done (create many partitions in one command) in much less verbose manner. Fabien started the discussion for hash partitioning because the interface for it seems straightforward -- just specify the number of partitions and that many partitions would get created without having to actually specify modulus/remainder for each. Since the underlying functionality wouldn't be too different for other partitioning methods, we would only have to come up with a suitable interface. Thanks, Amit