Re: On partitioning - Mailing list pgsql-hackers

From Robert Haas
Subject Re: On partitioning
Date
Msg-id CA+Tgmobb2DCxLV+CJzUNeZWMnrQKkarh9fomfa=CogX89naryg@mail.gmail.com
In response to Re: On partitioning  (Stephen Frost <sfrost@snowman.net>)
Responses Re: On partitioning  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Thu, Nov 13, 2014 at 1:39 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Agreed- a node tree seems a bit too far to make this really work well..
> But, I'm curious what you were thinking specifically?

I gave a pretty specific example in my email.

> A node tree which
> accepts an "argument" of the constant used in the original query and
> then spits back a table might work reasonably well for that case-

A node tree is not a function.  It's a data structure.  So it doesn't
have arguments.

> but
> with declarative partitioning, I expect us to eventually be able to
> eliminate complete partitions from consideration on both sides of a
> partition-table join and optimize cases where we have two partitioned
> tables being joined with a compatible join key and only actually do
> joins between the partitions which overlap each other.  I don't see
> those happening if we're allowing a node tree (only).  If having a node
> tree is just one option among other partitioning options, then we can
> provide users with the ability to choose what suits their particular
> needs.

This seems completely muddled to me.  What we're talking about is how
to represent the partition definition in the system catalogs.  I'm not
proposing that the user would "partition by pg_node_tree"; what the
heck would that even mean?  I'm proposing one way of serializing the
partition definitions that the user specifies into something that can
be stored into a system catalog, which happens to reuse the existing
infrastructure that we use for that same purpose in various other
places.  I don't have a problem with somebody coming up with another
way of representing the data in the catalogs; I'm just brainstorming.
But saying that we'll be able to optimize joins better if we store the
same data as anyarray rather than pg_node_tree or vice versa doesn't
make any sense at all.
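
To illustrate the point, here is a rough sketch in Python. It is purely
illustrative: RangeBound, the "array form" and the "tree form" are
made-up stand-ins, not PostgreSQL's actual anyarray or pg_node_tree
formats. The same partition bounds are written out in two different
serializations and read back into the identical in-memory structure,
which is all that any later optimization would ever look at.

```python
import ast
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeBound:
    """Hypothetical in-memory form of one range partition's bounds."""
    lower: int  # inclusive
    upper: int  # exclusive


def to_array_form(bounds):
    # Flat list of [lower, upper] pairs, loosely analogous to an array column.
    return json.dumps([[b.lower, b.upper] for b in bounds])


def from_array_form(text):
    return [RangeBound(lo, hi) for lo, hi in json.loads(text)]


def to_tree_form(bounds):
    # Nested tagged tuples, loosely analogous to a serialized node tree.
    return repr([("RANGE", ("CONST", b.lower), ("CONST", b.upper))
                 for b in bounds])


def from_tree_form(text):
    # literal_eval only parses Python literals; it does not run code.
    return [RangeBound(lo[1], hi[1]) for _tag, lo, hi in ast.literal_eval(text)]


bounds = [RangeBound(0, 10), RangeBound(10, 20)]
via_array = from_array_form(to_array_form(bounds))
via_tree = from_tree_form(to_tree_form(bounds))
```

Both round trips give back the same list of bounds, so whatever consumes
them cannot tell which on-disk representation was used.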

> I'm not a fan of using pg_class- there are a number of columns in there
> which I would *not* wish to be allowed to be different per partition
> (starting with relowner and relacl...).  Making those NULL would be just
> as bad (probably worse, really, since we'd also need to add new columns
> to pg_class to indicate the partitioning...) as having a sparsely
> populated new catalog table.

I think you are, again, confused as to what we're discussing.  Nobody,
including Alvaro, has proposed a design where the individual
partitions don't have pg_class entries of some kind.  What we're
talking about is where to store the metadata for partition exclusion
and tuple routing.
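
For concreteness, this is roughly the kind of metadata and the two
operations in question, sketched in Python under the assumption of
simple integer range partitions. BOUNDS, PARTITIONS, route() and
prune() are hypothetical names for illustration, not a proposed catalog
layout or an existing API.

```python
from bisect import bisect_right

# Hypothetical metadata: sorted range boundaries.
# Partition i holds keys in [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS = [0, 100, 200, 300]
PARTITIONS = ["p0", "p1", "p2"]


def route(key):
    """Tuple routing: return the partition that should store `key`."""
    i = bisect_right(BOUNDS, key) - 1
    if i < 0 or i >= len(PARTITIONS):
        raise ValueError(f"no partition for key {key}")
    return PARTITIONS[i]


def prune(lo, hi):
    """Partition exclusion: partitions that may hold keys in [lo, hi)."""
    return [name for i, name in enumerate(PARTITIONS)
            if BOUNDS[i] < hi and lo < BOUNDS[i + 1]]
```

Here route(100) picks "p1", and prune(50, 150) keeps "p0" and "p1" while
excluding "p2". Neither operation depends on the partitions' own
pg_class entries, only on the stored bounds.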

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


