Re: On partitioning - Mailing list pgsql-hackers

From: Andres Freund
Subject: Re: On partitioning
Date:
Msg-id: 20141208195650.GA29205@alap3.anarazel.de
In response to: Re: On partitioning  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: On partitioning  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On 2014-12-08 14:48:50 -0500, Robert Haas wrote:
> On Mon, Dec 8, 2014 at 2:39 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> I guess I'm in disagreement with you - and, perhaps - the majority on
> >> this point.  I think that ship has already sailed: partitions ARE
> >> tables.  We can try to make it less necessary for users to ever look
> >> at those tables as separate objects, and I think that's a good idea.
> >> But trying to go from a system where partitions are tables, which is
> >> what we have today, to a system where they are not seems like a bad
> >> idea to me.  If we make a major break from how things work today,
> >> we're going to end up having to reimplement stuff that already works.
> >
> > I don't think this makes much sense. That'd severely restrict our
> > ability to do stuff for a long time. Unless we can absolutely rely on
> > the fact that partitions have the same schema and such we'll rob
> > ourselves of significant optimization opportunities.
> 
> I don't think that's mutually exclusive with the idea of
> partitions-as-tables.  I mean, you can add code to the ALTER TABLE
> path that says if (i_am_not_the_partitioning_root) ereport(ERROR, ...)
> wherever you want.

That'll be a lot of places you'll need to touch. More fundamentally: Why
should we name something a table that's not one?
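
The guard suggested in the quoted text can be sketched as a small standalone C program fragment. This is only an illustration under stated assumptions: the RelationSketch struct, its is_partition field, and check_alter_table_target() are hypothetical names, not PostgreSQL's real catalog structures, and in the backend the failing branch would be an ereport(ERROR, ...) call rather than a message on stderr.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a catalog entry; not the real Form_pg_class. */
typedef struct RelationSketch
{
    const char *name;
    bool        is_partition;   /* assumed flag: true for a child of a partitioning root */
} RelationSketch;

/*
 * Returns true if a schema-changing ALTER TABLE may proceed on rel.
 * Mirrors the quoted "if (i_am_not_the_partitioning_root) ereport(ERROR, ...)"
 * idea; the error is printed instead of raised so the sketch runs standalone.
 */
static bool
check_alter_table_target(const RelationSketch *rel)
{
    if (rel->is_partition)
    {
        fprintf(stderr, "ERROR: cannot alter partition \"%s\" directly\n",
                rel->name);
        return false;
    }
    return true;
}
```

Every ALTER TABLE subcommand that must not diverge between partitions would need to call a check like this, which is the "lot of places" cost raised above.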

> >> Besides, I haven't really seen anyone propose something that sounds
> >> like a credible alternative.  If we could make partition objects
> >> things that the storage layer needs to know about but the query
> >> planner doesn't need to understand, that'd be maybe worth considering.
> >> But I don't see any way that that's remotely feasible.  There are lots
> >> of places that we assume that a heap consists of blocks number 0 up
> >> through N: CTID pointers, index-to-heap pointers, nodeSeqScan, bits
> >> and pieces of the way index vacuuming is handled, which in turn bleeds
> >> into Hot Standby.  You can't just decide that now block numbers are
> >> going to be replaced by some more complex structure, or even that
> >> they're now going to be nonlinear, without breaking a huge amount of
> >> stuff.
> >
> > I think you're making a wrong fundamental assumption here. Just because
> > we define partitions to not be full relations doesn't mean we have to
> > treat them entirely separate. I don't see why a pg_class.relkind = 'p'
> > entry would be something actually problematic. That'd easily allow to
> > treat them differently in all the relevant places (all of ALTER TABLE,
> > DML et al) and still allow all of the current planner/executor
> > infrastructure. We can even allow direct SELECTs from individual
> > partitions if we want to - that's trivial to achieve.
> 
> We may just be using different words to talk about more-or-less the
> same thing, then.

That might be.

> What I'm saying is that I want these things to keep working:

> - Indexes.

Nobody argued against that, I think.

> - Merge append and any other inheritance-aware query planning
> techniques.

Same here.

> - Direct access to individual partitions to bypass
> tuple-routing/query-planning overhead.

I think that might be ok in some cases, but in general I'd be very wary
of allowing that. I think it might be ok to allow direct read access, but
I'd be opposed to everything else. I'd much rather go the route of
allowing too few things and then gradually opening up if required than
the other way round (as that pretty much will never happen because it'll
break deployed systems).
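
As a rough illustration of that restrictive-by-default stance, the following standalone C sketch dispatches on a relkind character. The 'p' value is the hypothetical partition relkind floated earlier in this message, and AccessKind and direct_access_allowed() are names invented for the example; only 'r' for ordinary tables is an existing pg_class.relkind value. Other relkinds are out of scope here and are simply rejected.

```c
#include <stdbool.h>

/* Kinds of direct access a statement might request (hypothetical enum). */
typedef enum AccessKind
{
    ACCESS_SELECT,
    ACCESS_INSERT,
    ACCESS_UPDATE,
    ACCESS_DELETE,
    ACCESS_ALTER
} AccessKind;

/*
 * Decide whether a relation may be addressed directly.
 * 'r' = ordinary table: everything is allowed.
 * 'p' = partition (assumed relkind from the proposal): reads only.
 */
static bool
direct_access_allowed(char relkind, AccessKind access)
{
    switch (relkind)
    {
        case 'r':
            return true;
        case 'p':
            return access == ACCESS_SELECT;
        default:
            return false;   /* other relkinds are not modeled in this sketch */
    }
}
```

Starting from a single allowed case means that opening up more operations later is an additive change, whereas tightening an initially permissive rule would break callers that already depend on it.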

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


