Re: Syntax for partitioning - Mailing list pgsql-hackers

From Itagaki Takahiro
Subject Re: Syntax for partitioning
Date
Msg-id 20091102155044.C6B9.52131E4D@oss.ntt.co.jp
Whole thread Raw
In response to Re: Syntax for partitioning  (Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses Re: Syntax for partitioning
List pgsql-hackers
Here are details of partitioning syntax.

-----------------
 Syntax overview
-----------------
Partitions are defined with 3 steps:
 1. Create a plain table as parent.
 2. Set a partition key to a table.
 3. Add a partition to a table which has a partition key.

i.e.,
    CREATE TABLE table (...) PARTITION BY { RANGE | LIST } ( key ) (...)
is just an abbreviated form of:
 1. CREATE TABLE table (...);
 2. ALTER TABLE table PARTITION BY { RANGE | LIST } ( key );
 3. CREATE PARTITION name ON table VALUES ...;

Currently RANGE and LIST partitions are supported.
No reserved keywords are required by the syntax, and that's why
ALTER TABLE ADD PARTITION cannot be used here instead of CREATE PARTITION.

HASH partitions are not supported, but we can use LIST paritions with
an expression key as incomplete HASH partitions:
    CREATE TABLE table (...) PARTITION BY LIST ( hashtext(attr) ) (...);
    SELECT * FROM table WHERE hashtext(attr) = hashtext('search_key');

------------------------------------------
 Features *NOT* included in this proposal
------------------------------------------
To simplify patch, the following features are not included:
 - Partition triggers to dispatch rows inserted into parent table
 - Expanding some commands for a parent to partitions (ex. VACUUM)
 - ALTER commands except RENAME (ex. MERGE, SPLIT, UPDATE)
 - Ability to add an existing table to a parent as a partition
    (ex. ALTER TABLE table INHERIT parent AS PARTITION)
 - Planner and Executor improvements

I have plans to implement some of them in separated patches, but not now.

-----------------
 Catalog changes
-----------------
A new system catalog "pg_partition" represents partition keys for each
table. A parent table of partitions has only one pg_partition row.
I think separated pg_partition table is better than adding these columns
to pg_class, but it might be debatable.

CREATE TABLE pg_partition (
    partrelid oid REFERENCES oid ON pg_class,    -- partitioned table oid
    partopr   oid REFERENCES oid ON pg_operator, -- operator to comapre keys
    partkind  "char", -- kind of partition: 'R' (range) or 'L' (list)
    partkey   text,   -- expression tree of partition key
    PRIMARY KEY (partrelid)
) WITHOUT OIDS;

In addition, we would need to store threshold values of child tables
somewhere, but under consideration. I'm thinking to extract upper
and lower bounds from CHECK constraint, but it might be unreliable.
Comments and ideas welcome.

----------------
 Syntax details
----------------
CREATE TABLE table (...) PARTITION BY RANGE ( expr [USING operator] )
[ (
    PARTITION name VALUES LESS THAN [(] upper [)],
    PARTITION name VALUES LESS THAN [(] MAXVALUE [)] -- overflow partition
  ) ] ;
-- default operator is '<' for range partitions

CREATE TABLE table (...) PARTITION BY LIST ( expr [USING operator] )
[ (
    PARTITION name VALUES [IN] ( values ),
    PARTITION name VALUES [IN] [(] DEFAULT [)] -- overflow partition
  ) ] ;
-- default operator is '=' for list partitions

ALTER TABLE table PARTITION BY { RANGE | LIST } ... ;
ALTER TABLE table NO PARTITION; -- drop partition key

CREATE PARTITION partition ON table VALUES LESS THAN ...; -- range
CREATE PARTITION partition ON table VALUES IN (...);      -- list

DROP PARTITION partition;                 -- synonym for DROP TABLE
ALTER PARTITION partition RENAME TO name; -- synonym for ALTER TABLE RENAME

Note:
  * Each partition can have optional WITH (...) and TABLESPACE clauses.
  * '(' and ')' are optional to support both Oracle and MySQL syntax.

-----------
 WIP patch
-----------
The attached partitioning_20091102.patch is a WIP patch. There are
still not implemented features marked with TODO tags, but I'll use
this design -- especially Node manipulations.
Please notice me if I'm missing something.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

pgsql-hackers by date:

Previous
From: Nikhil Sontakke
Date:
Subject: Re: Syntax for partitioning
Next
From: Fujii Masao
Date:
Subject: backup_label in a crash recovery