Re: On partitioning - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: On partitioning
Date	December 8, 2014 19:39:10
Msg-id	20141208193902.GA30157@alap3.anarazel.de
In response to	Re: On partitioning (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: On partitioning
List	pgsql-hackers

On 2014-12-08 14:05:52 -0500, Robert Haas wrote:
> On Sat, Dec 6, 2014 at 3:06 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Sure, I don't think we should provide no way at all to take a dump
> > of an individual partition, just not at the level of an independent
> > table. Maybe something like --table <table_name>
> > --partition <partition_name>.
> >
> > In general, I think we should try to avoid exposing that partitions are
> > individual tables as that might hinder any future enhancement in that
> > area (for example, if someone finds a different and better way to
> > arrange the partition data, then due to the currently exposed syntax,
> > we might feel blocked).
> 
> I guess I'm in disagreement with you - and, perhaps - the majority on
> this point.  I think that ship has already sailed: partitions ARE
> tables.  We can try to make it less necessary for users to ever look
> at those tables as separate objects, and I think that's a good idea.
> But trying to go from a system where partitions are tables, which is
> what we have today, to a system where they are not seems like a bad
> idea to me.  If we make a major break from how things work today,
> we're going to end up having to reimplement stuff that already works.

I don't think this makes much sense. That'd severely restrict our
ability to do stuff for a long time. Unless we can absolutely rely on
the fact that partitions have the same schema and such, we'll rob
ourselves of significant optimization opportunities.

> Besides, I haven't really seen anyone propose something that sounds
> like a credible alternative.  If we could make partition objects
> things that the storage layer needs to know about but the query
> planner doesn't need to understand, that'd be maybe worth considering.
> But I don't see any way that that's remotely feasible.  There are lots
> of places that we assume that a heap consists of blocks number 0 up
> through N: CTID pointers, index-to-heap pointers, nodeSeqScan, bits
> and pieces of the way index vacuuming is handled, which in turn bleeds
> into Hot Standby.  You can't just decide that now block numbers are
> going to be replaced by some more complex structure, or even that
> they're now going to be nonlinear, without breaking a huge amount of
> stuff.

I think you're making a wrong fundamental assumption here. Just because
we define partitions to not be full relations doesn't mean we have to
treat them entirely separately. I don't see why a pg_class.relkind = 'p'
entry would be something actually problematic. That'd easily allow us to
treat them differently in all the relevant places (all of ALTER TABLE,
DML et al) and still allow all of the current planner/executor
infrastructure. We can even allow direct SELECTs from individual
partitions if we want to - that's trivial to achieve.
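
To make the relkind idea concrete, here is a rough sketch. It is not
PostgreSQL code: the function names and the exact policy are assumptions
made up for illustration, and 'p' is only the value proposed above. The
point is that one catalog flag can let scans treat a partition like any
heap while schema-changing commands treat it differently.

```python
# Hypothetical sketch only; nothing here is taken from PostgreSQL source.
RELKIND_RELATION = "r"   # ordinary table
RELKIND_PARTITION = "p"  # assumed value for a partition, as proposed above

# Kinds that the planner/executor could scan exactly like a plain heap.
SCANNABLE_KINDS = {RELKIND_RELATION, RELKIND_PARTITION}


def can_scan(relkind: str) -> bool:
    """Scans (including a direct SELECT) work for tables and partitions."""
    return relkind in SCANNABLE_KINDS


def can_alter_schema_directly(relkind: str) -> bool:
    """Assumed policy: only ordinary tables accept direct schema changes,
    so every partition is guaranteed to keep its parent's schema."""
    return relkind == RELKIND_RELATION


for kind in (RELKIND_RELATION, RELKIND_PARTITION):
    print(kind, can_scan(kind), can_alter_schema_directly(kind))
```

With a check like this in ALTER TABLE and friends, the existing scan
paths would not need to know that partitions are special at all.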

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
