Re: On partitioning - Mailing list pgsql-hackers

From Robert Haas
Subject Re: On partitioning
Date
Msg-id CA+TgmoZXp8ed2RsN=BzkUtD-b2cCTxJOLK550v43i3usVwZNkA@mail.gmail.com
In response to Re: On partitioning  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: On partitioning  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Mon, Dec 8, 2014 at 2:39 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I guess I'm in disagreement with you - and, perhaps - the majority on
>> this point.  I think that ship has already sailed: partitions ARE
>> tables.  We can try to make it less necessary for users to ever look
>> at those tables as separate objects, and I think that's a good idea.
>> But trying to go from a system where partitions are tables, which is
>> what we have today, to a system where they are not seems like a bad
>> idea to me.  If we make a major break from how things work today,
>> we're going to end up having to reimplement stuff that already works.
>
> I don't think this makes much sense. That'd severely restrict our
> ability to do stuff for a long time. Unless we can absolutely rely on
> the fact that partitions have the same schema and such we'll rob
> ourselves of significant optimization opportunities.

I don't think that's mutually exclusive with the idea of
partitions-as-tables.  I mean, you can add code to the ALTER TABLE
path that says if (i_am_not_the_partitioning_root) ereport(ERROR, ...)
wherever you want.
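
To make that concrete, here is a rough standalone sketch of such a guard. It is not PostgreSQL source: the `Rel` struct, `find_partitioning_root()`, and `check_alter_allowed()` are hypothetical names made up for illustration, and filling an error buffer stands in for what would be an ereport(ERROR, ...) call in the real ALTER TABLE path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a relation's catalog entry (not a PostgreSQL struct). */
typedef struct Rel
{
    const char       *name;
    const struct Rel *parent;   /* NULL for a partitioning root or plain table */
} Rel;

/* Walk up the parent links until we reach the relation with no parent. */
const Rel *
find_partitioning_root(const Rel *rel)
{
    while (rel->parent != NULL)
        rel = rel->parent;
    return rel;
}

/*
 * Guard for an ALTER TABLE-style code path: partitions stay ordinary
 * tables, but schema changes are only accepted on the root.  Returns true
 * if the change may proceed; otherwise writes a message into errbuf and
 * returns false.  In the backend this would be ereport(ERROR, ...).
 */
bool
check_alter_allowed(const Rel *rel, char *errbuf, size_t errlen)
{
    const Rel *root = find_partitioning_root(rel);

    if (rel != root)
    {
        snprintf(errbuf, errlen,
                 "cannot alter partition \"%s\" directly; alter \"%s\" instead",
                 rel->name, root->name);
        return false;
    }

    if (errlen > 0)
        errbuf[0] = '\0';       /* no error to report */
    return true;
}
```

The point is only that the restriction sits in the command path: nothing here stops a partition from having its own heap, indexes, and catalog row.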

>> Besides, I haven't really seen anyone propose something that sounds
>> like a credible alternative.  If we could make partition objects
>> things that the storage layer needs to know about but the query
>> planner doesn't need to understand, that'd be maybe worth considering.
>> But I don't see any way that that's remotely feasible.  There are lots
>> of places that we assume that a heap consists of blocks number 0 up
>> through N: CTID pointers, index-to-heap pointers, nodeSeqScan, bits
>> and pieces of the way index vacuuming is handled, which in turn bleeds
>> into Hot Standby.  You can't just decide that now block numbers are
>> going to be replaced by some more complex structure, or even that
>> they're now going to be nonlinear, without breaking a huge amount of
>> stuff.
>
> I think you're making a wrong fundamental assumption here. Just because
> we define partitions to not be full relations doesn't mean we have to
> treat them entirely separate. I don't see why a pg_class.relkind = 'p'
> entry would be something actually problematic. That'd easily allow to
> treat them differently in all the relevant places (all of ALTER TABLE,
> DML et al) and still allow all of the current planner/executor
> infrastructure. We can even allow direct SELECTs from individual
> partitions if we want to - that's trivial to achieve.

We may just be using different words to talk about more-or-less the
same thing, then.  What I'm saying is that I want these things to keep
working:

- Indexes.
- Merge append and any other inheritance-aware query planning techniques.
- Direct access to individual partitions to bypass
  tuple-routing/query-planning overhead.

I am not necessarily saying that I have a problem with putting other
restrictions on partitions, like requiring them to have the same tuple
descriptor or the same ACLs as their parents.  Those kinds of details
bear discussion, but I'm not intrinsically opposed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


