Re: On partitioning - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: On partitioning
Date
Msg-id CAA4eK1JY-74Tabb4sLJ2PK+3bQ--DhK6tpVzv-Hkr0a_nPzBeQ@mail.gmail.com
Whole thread Raw
In response to Re: On partitioning  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Dec 11, 2014 at 8:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Dec 11, 2014 at 12:00 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Yeah either this way or what Josh has suggested upthread, the main
> > point was that if at all we want to support multi-column list partitioning
> > then we need to have slightly different syntax, however I feel that we
> > can leave multi-column list partitioning for first version.
>
> Yeah, possibly.
>
> I think we could stand to have a lot more discussion about the syntax
> here.  So far the idea seems to be to copy what Oracle has, but it's
> not clear if we're going to have exactly what Oracle has or something
> subtly different.  I personally don't find the Oracle syntax very
> PostgreSQL-ish.

I share your concern w.r.t the difficulties it can create if we don't
do it carefully (one of the issue you have mentioned upthread about
pg_dump, other such things could cause problems, if not thought
of carefully from the beginning).  One more thing, on a quick check
it seems to me even DB2 uses some-thing similar to Oracle for
defining partitions

CREATE TABLE orders(id INT, shipdate DATE, …) 
PARTITION BY RANGE(shipdate) 
( PARTITION q4_05 STARTING MINVALUE, 
  PARTITION q1_06 STARTING '1/1/2006',
  PARTITION q2_06 STARTING '4/1/2006',
  PARTITION q3_06 STARTING '7/1/2006',
  PARTITION q4_06 STARTING '10/1/2006' 
                              ENDING ‘12/31/2006' )

I don't think there is any pressing need for PostgreSQL to use
syntax similar to what some of the other databases use, however
it has an advantage for ease of migration and ease of use (as
people are already familiar with using such syntax).

> Stuff like "VALUES LESS THAN 500" doesn't sit
> especially well with me - less than according to which opclass?  Are
> we going to insist that partitioning must use the default btree
> opclass so that we can use that syntax?  That seems kind of lame.
>

Can't we simply specify the opclass along with column name while
specifying partition clause which I feel is something similar to we
already do in CREATE INDEX syntax.

CREATE TABLE sales
     ( invoice_no NUMBER,
       sale_year  INT NOT NULL,
       sale_month INT NOT NULL,
       sale_day   INT NOT NULL )
   PARTITION BY RANGE ( sale_year <opclass>)
     ( PARTITION sales_q1 VALUES LESS THAN (1999)

Isn't the default operator class for a partition column would fit the
bill for this particular case as the operators required in this syntax
will be quite simple?

> There are lots of interesting things we could do here, e.g.:
>
> CREATE TABLE parent_name PARTITION ON (column [ USING opclass ] [, ... ]);
> CREATE TABLE child_name PARTITION OF parent_name
>    FOR { (value, ...) [ TO (value, ...) ] } [, ...];
>

The only thing which slightly bothers me about this syntax is that
it makes apparent that partitions are separate tables and it would
be inconvenient if we choose to disallow some operations on
partitions.  I think it might be better we treat partitions as a way
to divide the large amount of data and users be only given the
option to specify boundaries to divide this data and storage mechanism
of partitions should be an internal detail (something like we do in
TOAST table case).  I am not sure which syntax users will be more
comfortable to use as I am seeing and using Oracle type syntax from
long time so my opinion could be biased in this case.  It would be really
helpful if others who need or use partitioning scheme can share their
inputs.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: "Amit Langote"
Date:
Subject: Re: On partitioning
Next
From: Etsuro Fujita
Date:
Subject: Re: [Bug] Inconsistent result for inheritance and FOR UPDATE.