Home > mailing lists

Re: Declarative partitioning - Mailing list pgsql-hackers

From	Josh Berkus
Subject	Re: Declarative partitioning
Date	August 18, 2015 17:31:48
Msg-id	55D36BFA.7080004@agliodbs.com Whole thread Raw
In response to	Declarative partitioning (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses	Re: Declarative partitioning Re: Declarative partitioning
List	pgsql-hackers

Tree view

Amit,

> I would like propose $SUBJECT for this development cycle. Attached is a
> WIP patch that implements most if not all of what's described below. Some
> yet unaddressed parts are mentioned below, too. I'll add this to the CF-SEP.

First of all, wow!  Really happy to see this.

> 
> Syntax
> ======
> 
> 1. Creating a partitioned table
> 
> CREATE TABLE table_name
> PARTITION BY {RANGE|LIST}
> ON (column_list);
> 
> Where column_list consists of simple column names or expressions:
> 
> PARTITION BY LIST ON (name)
> PARTITION BY RANGE ON (year, month)
> 
> PARTITION BY LIST ON ((lower(left(name, 2)))
> PARTITION BY RANGE ON ((extract(year from d)), (extract(month from d)))

So far so good.  Have you given any thought as to how a user will
determine which partition corresponds to which values (for purposes of
dropping/maintaining a partition)?

Also, won't doing things like extract() for range partitions make it
much harder for you to develop the planner parts of this solution?

What about defining an interval instead, such as:

PARTITION BY RANGE USING ( interval ) ON ( column );

i.e.

PARTITION BY RANGE USING ( INTERVAL '1 month' ) ON ( submitted_date );
PARTITION BY RANGE USING ( 100000 ) ON ( user_id );

This would make it easy for you to construct range type values defining
the range of each partition, which would then make the planner work much
easier than calling a function would, no?

Or am I misunderstanding how you're using ranges here? It kind of seems
like you're still leaving specific range defintions up to the user,
which is (from my perspective) unsatisfactory (see below).

I'm assuming that all range partitions will be [ closed, open ) ranges.


> 2. Creating a partition of a partitioned table
> 
> CREATE TABLE table_name
> PARTITION OF partitioned_table_name
> FOR VALUES values_spec;
> 
> Where values_spec is:
> 
> listvalues: [IN] (val1, ...)
> 
> rangevalues: START (col1min, ... ) END (col1max, ... )
>                | START (col1min, ... )
>                | END (col1max, ... )

So, one thing I missed in here is anything about automated partitioning
of tables; that is, creating new partitions based on incoming data or a
simple statement which doesn't require knowledge of the partitioning
scheme.  It's possible (and entirely accceptable) that you're
considering automated partition creation outside of the scope of this
patch.  However, for range partitions, it would be *really* useful to
have this syntax:

CREATE NEXT PARTITION ON parent_table;

Which would just create the "next" partition based on whatever the range
partitoning scheme is, instead of requiring the user to calculate start
and end values which might or might not match the parent partitioning
scheme, and might leave gaps.  Also this would be useful for range
partitions:

CREATE PARTITION ON parent_table USING ( start_value );

... where start_value is the start range of the new partition.  Again,
easier for users to get correct.

Both of these require the idea of regular intervals for range
partitions, that is, on a table partitioned by month on a timestamptz
column, each partition will have the range [ month:1, nextmonth:1 ).
This is the most common use-case for range partitions (like, 95% of all
partitioning cases I've seen), so a new partitioning scheme ought to
address it.

While there are certainly users who desire the ability to define
arbitrary ranges for each range partition, these are by far the minority
and could be accomodated by a different path with more complex syntax.
Further, I'd wager that most users who want to define arbitrary ranges
for range partitions aren't going to be satisfied with the other
restrictions on declarative partitioning (e.g. same constraints, columns
for all partitions) and are going to use inheritance partitioning anyway.

> 5. Detach partition
> 
> ALTER TABLE partitioned_table
> DETACH PARTITION partition_name [USING table_name]
> 
> This removes partition_name as partition of partitioned_table. The table
> continues to exist with the same name or 'table_name', if specified.
> pg_class.relispartition is set to false for the table, so it behaves like
> a normal table.

What about DROPping partitions?  Do they need to be detached first?

> Creating index on parent is not allowed. They should be defined on (leaf)
> partitions. Because of this limitation, primary keys are not allowed on a
> partitioned table. Perhaps, we should be able to just create a dummy
> entry somewhere to represent an index on parent (which every partition
> then copies.) 

This would be preferable, yes.  Making users remember to manually create
indexes on each partition is undesirable.

> What should TRUNCATE on partitioned table do?

On the master table?  Truncate all individual partitions.  Do not drop
the partitions.

On a partitition?  Truncate just that partition.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

pgsql-hackers by date:

From: Peter Eisentraut
Date: 18 August 2015, 17:25:07
Subject: Re: allowing wal_level change at run time

From: Andres Freund
Date: 18 August 2015, 17:46:53
Subject: Re: allowing wal_level change at run time

Re: Declarative partitioning - Mailing list pgsql-hackers

Previous

Next