Re: On partitioning - Mailing list pgsql-hackers

From Amit Langote
Subject Re: On partitioning
Date
Msg-id 066501cffd52$3d919390$b8b4bab0$@lab.ntt.co.jp
Whole thread Raw
In response to Re: On partitioning  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: On partitioning  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Saturday, November 08, 2014 5:41 AM
>
> I'd be in favor of that.

Thanks!

>  I am not sure whether the code is close
> enough to what we need to be really useful, but that's for you to
> decide.

Hmm, I'm not entirely convinced about the patch as it stands either but, I will 
try to restate below what the patch in its current state does anyway (just to 
refresh):

The patch provides syntax to:* Specify partitioning key, optional partition definitions within CREATE TABLE,* A few
ALTERTABLE commands that let you define a partitioning key 
 
(partitioning a table after the fact), attach/detach an existing table as a 
partition of a partitioned table,* CREATE PARTITION to create a new partition on a partitioned table.

Above commands are merely transformed into ALTER TABLE subcommands that arrange 
partitioned table and partitions into inheritance hierarchy, but with extra 
information, that is, allowed values for the partition in a new anyarray column 
called 'pg_inherits.values'. A special case of ATExecAddInherit() namely 
ATExecAttachPartitionI(), as part of its processing, also adds partition 
constraints in the form of appropriate CHECK constraints. So, a few of the 
manual steps are automated and additional (IMHO non-opaque) metadata (namely 
partition boundaries/list values) is added.

Additionally, defining a partitioning key (PARTITION BY) creates a pg_partition 
entry that specifies for a partitioned table the following - partition kind 
(range/list),  an opclass for the key value comparison  and a key 'expression' 
(say, "colname % 10").

A few key things I can think of as needing improvement would be  (perhaps just 
reiterating a review of the patch):
* partition pruning would still depend on constraint exclusion using the CHECK 
constraints (same old)* there is no tuple-routing at all (same can be said of partition pruning 
above)* partition pruning or tuple-routing would require a scan over pg_inherits 
(perhaps inefficient)* partitioning key is an expression which might not be a good idea in early 
stages of the implementation (might be better off with just the attnum of the 
column to partition on?)* there is no DROP PARTITION (in fact, it is suggested not to go CREATE/DROP 
PARTITION route at all) -> ALTER TABLE ... ADD/DROP PARTITION?

Some other important ones:* dependency handling related oversights* constraint propagation related oversights

And then some of the oddities of behaviour that I am seeing while trying out 
things that the patch does. Please feel free to suggest those that I am not 
seeing. I am sure these improvements need more than just tablecmds.c hacking 
which is what the current patch mostly does.

The first two points could use separate follow-on patches as I feel they need 
extensive changes unless I am missing something. I will try to post possible 
solutions to these issues provided metadata in current form is OK to proceed.

> In my view, the main problem we should be trying to solve
> here is "avoid relying on constraint exclusion".  In other words, the
> syntax for adding a partition should put some metadata into the system
> catalogs that lets us do partitioning pruning very very quickly,
> without theorem-proving.  For example, for list or range partitioning,
> a list of partition bounds would be just right: you could
> binary-search it.  The same metadata should also be suitable for
> routing inserts to the proper partition, and handling partition motion
> when a tuple is updated.
>
> Now there's other stuff we might very well want to do, but I think
> making partition pruning and tuple routing fast would be a pretty big
> win by itself.
>

Those are definitely the goals worth striving for.

Thanks for your time.

Regards,
Amit






pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Proposal: Log inability to lock pages during vacuum
Next
From: Jim Nasby
Date:
Subject: Re: Add CREATE support to event triggers