Re: Declarative partitioning - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Declarative partitioning
Date
Msg-id 57146A6C.2010801@lab.ntt.co.jp
Whole thread Raw
In response to Re: Declarative partitioning  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: Declarative partitioning  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
Hi Ashutosh,

On 2016/04/15 18:48, Ashutosh Bapat wrote:
> With the new set of patches, I am getting following warnings but no
> compilation failures. Patches apply smoothly.
> partition.c:1216:21: warning: variable ‘form’ set but not used
> [-Wunused-but-set-variable]
> partition.c:1637:20: warning: variable ‘form’ set but not used
> [-Wunused-but-set-variable]

Ah, will fix.

> For creating partition the documentation seems to suggest the syntax
> create table t1_p1 partition of t1 for values {start {0} end {100}
> exclusive;
> but following syntax works
> create table t1_p1 partition of t1 for values start (0) end (100) exclusive;

Hmm, I see the following in docs:

...
and partition_bound_spec is:

FOR VALUES { list_spec | range_spec }

...

list_spec in FOR VALUES is:

{ IN ( expression [, ...] ) }

range_spec in FOR VALUES is:

{ START (lower-bound) [ INCLUSIVE | EXCLUSIVE ] | END (upper-bound) [ INCLUSIVE | EXCLUSIVE ] | START (lower-bound) [
INCLUSIVE| EXCLUSIVE ] END (upper-bound) [
 
INCLUSIVE | EXCLUSIVE ] }

By the way, I see that you are always specifying "exclusive" for end bound
in your examples.  The specification is optional if that wasn't apparent.Default (ie, without explicit specification)
is[start, end).
 

>> I got rid of all the optimizer changes in the new version (except a line
>> or two).  I did that by switching to storing parent-child relationships in
>> pg_inherits so that all the existing optimizer code for inheritance sets
>> works unchanged.  Also, the implementation detail that required to put
>> only leaf partitions in the append list is also gone.  Previously no
>> storage was allocated for partitioned tables (either root or any of the
>> internal partitions), so it was being done that way.
>>
> 
> With this set of patches we are still flattening all the partitions.
> postgres=# create table t1 (a int, b int) partition by range(a);
> CREATE TABLE
> postgres=# create table t1_p1 partition of t1 for values start (0) end
> (100) exclusive partition by range(b);
> CREATE TABLE
> postgres=# create table t1_p1_p1 partition of t1_p1 for values start (0)
> end (100) exclusive;
> CREATE TABLE
> postgres=# explain verbose select * from t1;
>                                QUERY PLAN
> -------------------------------------------------------------------------
>  Append  (cost=0.00..32.60 rows=2262 width=8)
>    ->  Seq Scan on public.t1  (cost=0.00..0.00 rows=1 width=8)
>          Output: t1.a, t1.b
>    ->  Seq Scan on public.t1_p1  (cost=0.00..0.00 rows=1 width=8)
>          Output: t1_p1.a, t1_p1.b
>    ->  Seq Scan on public.t1_p1_p1  (cost=0.00..32.60 rows=2260 width=8)
>          Output: t1_p1_p1.a, t1_p1_p1.b
> (7 rows)
> Retaining the partition hierarchy would help to push-down join across
> partition hierarchy effectively.

Yes, it's still a flattened append (basically what optimizer currently
creates for arbitrary inheritance trees).  I didn't modify any of that yet.

>> Regarding 6, it seems to me that because Append does not have a associated
>> relid (like scan nodes have with scanrelid).  Maybe we need to either fix
>> Append or create some enhanced version of Append which would also support
>> dynamic pruning.
>>
> 
> Right, I think, Append might store the relid of the parent table as well as
> partitioning information at that level along-with the subplans.

For time being, I will leave this as yet unaddressed (I am thinking about
what is reasonable to do for this also considering Robert's comment).  Is
that OK?

> Some more comments
> Would it be better to declare PartitionDescData as
> {
> int nparts;
> PartitionInfo *partinfo; /* array of partition information structures. */
> }

I think that might be better.  Will do it the way you suggest.

> The new syntax allows CREATE TABLE to be specified as partition of an
> already partitioned table. Is it possible to do the same for CREATE FOREIGN
> TABLE? Or that's material for v2? Similarly for ATTACH PARTITION.

OK, I will address this in the next version.  One question though: should
foreign table be only allowed to be leaf partitions (ie, no PARTITION BY
clause in CREATE FOREIGN TABLE ... PARTITION OF)?  By the way, ATTACH
PARTITION only allows leaf partitions anyway.

Thanks a lot for the review!

Thanks,
Amt





pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Confusing comment in pg_upgrade in regards to VACUUM FREEZE
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Support for N synchronous standby servers - take 2